The Conversational Interface: Our Next Great Leap Forward
(aka Conversational User Interface, Linguistic UI, Natural UI, Spoken Dialog System, etc.)
© 2003-2007, John Smart. Reproduction, review and quotation encouraged with attribution.
Outline
On Phase Change Singularities: The Nature and Timing of CI Emergence
After the Symbiotic Age: Speculations on Autonomy and Beyond
Whatever we call it, we suggest this will be the next major, 'internet-level' development, the next great information technology advance for our planet. For some recent technical books on the topic, you might enjoy Spoken Dialog Technology: Toward the Conversational User Interface, Michael McTear, 2004, and Practical Spoken Dialog Systems, Deborah Dahl (Ed), 2005. Microsoft has long been working on the software behind CIs at Microsoft Research. But the folks at Google may be much farther along in this area than Redmond. In fact, they may have an intrinsic functional and technological advantage with their distributed platform that will make the Google Browser (or at least, the Google "brain" behind everyone's browser) the OS of choice in coming decades. That is a fascinating realization, if it turns out to be true. For all its vigilance, intelligence, and wealth, it now seems plausible that Microsoft, like IBM before it, is on track to become a middleware player, and to lose its global software dominance to the chaotic, creatively destructive forces of emergent technological intelligence. What an amazingly innovative time to be alive.
Most obviously, the CI will help us address the current global inequity of access to high quality, lifelong education in our increasingly technological world. But there is also compelling early evidence that CIs will help us discover better collective solutions in governance, globalization, environment, security, health, and productivity, among other domains. The arrival of a functional CI network will not entail full machine self-awareness, or what we might call "human-equivalent" artificial intelligence (AI), but is a transitional stage of advanced natural language processing (NLP), a field that deserves far greater funding and attention than it attracts today. NLP advances will combine with critically needed improvements in bandwidth of connectivity and the hardware and software of simulations, so that our CI devices and humanlike agents will "talk" both to each other and to us, using data-rich semantic protocols, continually tuned by the interaction of hundreds of millions of humans with the system.
What will our browser of 2015 look like? For one thing, it seems clear now that it will have some very sophisticated software simulations of human beings as part of the interface. First world culture finally spends more on video games than movies, and this will apparently be a permanent feature of our world from this point forward. These "interactive motion pictures" are more compelling and educating, particularly to our youth, the fastest learning segment of our society, than any linear scripts, no matter how professionally produced. As we near the end of the Computer/Information Age then, we can observe that the dawn of the Era of Simulations, the first great implementation of virtual reality, has finally arrived. Now imagine that we have begun talking to our computers in a crude but useful verbal exchange circa 2015. It is becoming clear that we will not simply want to talk to a disembodied machine.
We will want to relate to our favorite virtual human beings, as embodied agents will have an ability to nonverbally communicate, to frown or place their hand on their chin until they understand what we are telling them to do, to smile when they detect we are smiling at their jokes, to talk and act in a calm and relaxing manner when their voice analyzers tell them we are upset, to speak more rapidly when they detect we are bored or hurried, etc. This parallel, nonverbal visual channel makes all our linguistic communication a lot more efficient: it's why face to face meetings are preferred over telephonic meetings for a wide range of interactive tasks. Thus our CI-equipped virtual avatars will model and display human emotion and body language, but given computer technology's far greater rates (at least one millionfold greater) of innate learning over slow-switching biological information processors like us, they will adapt to us and improve their usefulness at a rate that will seem uncannily fast. In the process, our avatars (our "digital twins") will become our best filters of the complexity of our environments, and they will increasingly represent and motivate us, acting in more uniquely differentiated, creative ways than we can in the physical space. I will also argue that as they approach human level sophistication, we will naturally seek to use them to act more ethically than we do today, and this represents yet another great opportunity for growth as CIs develop in coming years.
When can we expect the CI's emergence? In March 2005 Google's director of search and AC2004 change leader Peter Norvig noted that their average query is now about 2.5 words long, by comparison to 1.3 on Alta Vista in its heyday, circa 1998. In subsequent email conversation he told me that the actual number is "closer to 2.6 or 2.7." This is an initial doubling time of only seven years.
In my opinion this average query length, averaged across all the leading search engines of the day (Google, Yahoo, MSN, etc.) will be one of the key numbers to watch to gauge the growing effectiveness of statistical natural language processing (statistical NLP) in creating a conversational front end for the internet and all our other complex technologies in the 21st century. My cursory research suggests English speakers use an average of 18 words per written sentence and 14 words per spoken sentence, and when we ask others questions, we pare this down further to something like 11 words. I would expect that as soon as our search tools get up over 8 words a sentence we'll start to see and expect emergent "pidgin" grammars in our computer's responses, and begin to feel like our computers have a primitive conversational intelligence. How long might this take? That's a big guess today, but given that it has taken approximately seven years to double from 1.3 to 2.6 words per average query, we might need another seven years, circa 2012, to get us to 5.2 words, a period I suspect would be just prior to the feeling of CI intelligence on the part of the user. Already, today, when you use Google's "near" operator, as in: "Coffee shops near Palo Alto", which returns a Google Map (Google's new optical cortex!) of yellow pins, all distributed around Palo Alto's city center, you are using a query length of four or five words. But this doesn't feel like conversational intelligence. That's where I suspect we'll be in 2012, a lot of folks using a lot of these simple operators (time, distance, etc.) but probably still mostly by keyboard, and still mostly thinking of the internet as a (wonderful, but not very intelligent) information appliance. Now step forward another seven years, to 2019, and in my estimation we will likely have doubled our search length yet again, to just over ten words per average query. 
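The extrapolation above is simple enough to check by hand, but a short sketch makes the arithmetic explicit. It assumes, as the text does, a pure exponential with a seven-year doubling time anchored at 2.6 words in 2005; the constant and function names are illustrative only, and the result is no better than that assumption.

```python
import math

# Assumptions taken from the text: about 2.6 words per average query in 2005,
# doubling every seven years. All names here are illustrative.
BASE_YEAR = 2005
BASE_WORDS = 2.6
DOUBLING_YEARS = 7.0


def projected_query_length(year):
    """Average query length in words under pure exponential growth."""
    return BASE_WORDS * 2 ** ((year - BASE_YEAR) / DOUBLING_YEARS)


def year_reaching(words):
    """Year in which the projected average query length reaches `words`."""
    return BASE_YEAR + DOUBLING_YEARS * math.log2(words / BASE_WORDS)


if __name__ == "__main__":
    for year in (1998, 2005, 2012, 2019):
        print(year, round(projected_query_length(year), 1))
    # Year the projection crosses the eight-word "pidgin grammar" threshold.
    print(round(year_reaching(8.0), 1))
```

Under these assumptions the curve passes through 1.3 words in 1998, 5.2 in 2012, and 10.4 in 2019, and crosses the eight-word threshold during 2016. A different doubling time or starting value would shift all of these dates.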
Somewhere between 2012 and 2019 I expect we'll see voice recognition queries (many of them mobile) begin to compete with keyed entry, and a whole new level of sophistication (and user feedback/ranking/rating) of the average queries. This time, I think we have enough new functionality to create a "step function" in user experience, where the web no longer feels like just an information appliance but like a partner, a crude extension of our linguistic ability. That's what I would call the end of the Information Age and the start of the Symbiotic Age. From that point forward we'll begin to feel naked out in public without the web, the way we'd feel out in public without our clothes today.
What is our evidence that this query length doubling will continue? It's weak today, but I think still worth watching closely. First, consider that seven years per doubling would be just over the six year doubling in average software productivity or general algorithmic efficiency quoted by Bill Joy and others in the IT industry as a rough "Moore's law for software." But doubled algorithmic power or efficiency alone would be unlikely to translate to doubled query lengths. We will need a lot more insight before we can make this claim. My main intuition in this regard is that the human conversation space is sharply finite, that the encoding of human conversation in easily spiderable form is doubling in volume at an enormous rate (roughly every two years, since the start of the web), and that our ability to rank the relative value of those encodings is steadily improving (Google's PageRank, Web 2.0, 3.0, etc.). In other words, I suspect that the human conversation phase space, in all languages, is beginning to become ergodic (computationally closed, with all the most useful and functional thoughts/ideas/sentences continually revisited on a random basis).
Given the finite nature of the human conversation space and the rapidity with which our web publishing, email archiving, and other searchable online conversations are beginning to cover it, I suspect the problem of serving up a useful natural language response to an online query is very much like codebreaking, a cryptographic problem that involves finding the set of primers, or translation elements, that are repeatedly used to transform one set of information (the world wide web of encoded ideas) into another (the set of most useful linguistic responses to queries about common human problems). If this is true, then the growth of search engine query length in coming years will, like many things in nature, follow a logistic curve (an "S" curve), which means its first phase of growth, before reaching the inflection point, will be roughly exponential, with an alleged doubling time of seven years. I suspect the saturation point in query length will come at some sentence length that is longer than the average human-to-human query length in spoken sentences (11 words, by my rough research). I suggest this because when you query Google, it is often to your advantage, even when asking technical questions today, to include additional words beyond those you'd normally use when asking a human in natural conversation. You use those specialized words with Google because you suspect (and it is increasingly true) that Google knows everything, unlike the average human. A slide from one of my technology futures presentations, below, attempts to summarize this point:
If (a big if) these proposals and qualifiers are approximately correct, this would place the CI's emergence circa 2015-2020. If anyone has better numbers or methods to suggest, please let me know, and help me to improve this page. Let me go on record proposing that the conversational interface will be the single most important technological innovation the average person alive today will witness in their lifetimes (out of a long list of competing innovations, like personal computers, automated supply chains, and cell phones). In terms of broad scientific, technological, economic, political, educational, and social impact on human society, I expect it will make even the emergence of the internet seem minor by comparison. The CI makes both the greatest wisdom of the species and its lowest common denominator distractions perennially accessible to all of us. It will surely be greatly misused in its early stages, but in the long run it will allow what we say, and hear, to bring us to a whole new level of conscious insight about ourselves and the world. If 2020 is our expected transition point, then CIs are a bit farther off than some of our most optimistic technology futurists would today have us believe. Yet they are also sooner than the naysayers expect, those who tell us NLP is riddled with near-insoluble problems, and who don't understand how far we've advanced already with simple statistically based systems. I have heard that Google, for example, has won U.S. NIST's benchmark language translation competition, over IBM's and others' ontological and mixed systems, with a relatively simple statistical NLP approach for at least two years in a row (2005 and 2006). With leadership, luck, resolve, and exponentially more powerful computing, text analytics, and communications platforms, we might even be able to accelerate CI development to occur earlier than the 2020 ETA.
In addition to broadband and wireless access for everyone, I suggest that may be one of the noblest challenges of our generation, in fact. It seems very likely that we will all soon widely recognize this as the Next Great Leap after the internet. Read on, and let us know if you agree. After you have finished this introduction, you may also enjoy a further CI exploration that adds valuable historical context, Promontory Point Revisited: The Transcontinental Railroad and the Conversational Interface.
A Question of Priorities in a World of Accelerating Computation
Annually for the last six years, John Brockman's excellent and provocative World Question Center at Edge.org has posed an interesting question to roughly 100 deep-thinking futurists who are committed to integrating both scientific and humanist perspectives on the world. In 2002, they answered a fictional letter from President G.W. Bush which asked each of them, as the administration's new science advisor, "What are the pressing scientific issues for the nation and the world, and what is your advice on how I can begin to deal with them?" As a developmental futurist, one who expects that a special subset of future events is statistically inevitable and highly predictable, I drafted my own response (though not requested to do so), "The CI Network," below. It proposes my own choice for the primary scientific-technological policy priority for the first world at the present time. If the CI Network is in fact an inevitable developmental event, this priority persists whether we consciously recognize and guide its evolutionary development, or unconsciously work to fulfill its emergence in a less foresighted and less presently beneficial fashion.
Clearly the keyboard is a primitive, first-generation interface to our personal computational machines. It gives us information, but not symbiosis. We humans don't twiddle our fingers at each other when we exchange information. We primarily talk, and use a rich repertoire of emotional and body language in simultaneous, noninterfering channels. We have also used our hands to help each other and to manipulate objects ever since the first stone was thrown by the first hominid, so it is also clear that keyboards won't disappear until the human form itself disappears. In other words, talking is the highest, most natural, and most inclusive form of human communication, and soon our computers everywhere will allow us to interface with them in this new computational domain. I suggest that achieving this emergence will be the great technological "Moon Shot" of our generation, whether we presently realize it or not.
The CI Network: A National Priority for Our Generation
Science and technology, and broadly, local computation, appear to be asymptotically accelerating, universally-driven phenomena. We are now coming to understand that humanity does not control this developmental process, but rather selectively catalyzes it, ideally with ever increasing foresight. The entire 20th century demonstrated an astounding, unrelenting, unprecedented double exponential growth in the price performance of our computational machines. At the same time, we have seen new levels of computational autonomy, or human-independence, emerge, wherein a rapidly diminishing fraction of human effort is required to produce any fixed amount of computational complexity within each new computing system. These are apparently universal developmental trends, not architected by human design or even desire. For the last decade at least, increasingly evolutionary and biologically inspired forms of computation have become the leading edge of technological development, and will remain so for the foreseeable future.
Referring to the difficulty of technology prediction, Bill Gates reportedly once said "find me the person who predicted the internet, and we'll make him king." This is congruent with a common myth that futurists missed this major development, and certainly many did. The first major "think tank" long range public futures project of the postwar era, The Year 2000: A Framework for Speculation on the Next Thirty-Three Years, Kahn and Wiener, 1967, certainly missed the decentralization trend, though they did see computing continuing to accelerate. But that only shows the riskiness of relying on one forecasting group to understand the future. Every community has its own biases. Among the global community there were numerous visionaries who foresaw various pieces of the internet long before it emerged. In 1937, H.G. Wells in "World Brain," articulated the developmental inevitability of a rapidly updating compendium of total world knowledge. In 1945, in "As We May Think," Vannevar Bush proposed the Memex, a proto-hypertext microfiche network that would organize and distribute the world's knowledge, and noted "The advanced arithmetical machines of the future will be electrical in nature, and they will perform at 100 times present [electromechanical relay computer] speeds." In 1946, a year into the modern television era, Will F. Jenkins (aka Murray Leinster) in "A Logic Named Joe," predicted "logics," televisions with attached keyboards all networked by a switching innovation called the "Carson Circuit," that would be used to watch TV, make video phone calls, send and receive telegraphic messages (email), get weather reports, ask research questions, keep books, trade stocks, and play games. The emergence of personal computers was repeatedly predicted by journalists and commentators in the 1950's, and such machines were a longtime goal of electronic hobbyists, who were making successively more complicated home built electronic systems.
Peter Drucker predicted our 1980's economic shift to the information economy in the revolutionary Age of Discontinuity, 1968. Alvin Toffler expanded on this and the coming network of "electronic cottages" we would see in the 1990's in The Third Wave, 1980. In other words, there were many harbingers of the internet for those willing to look, and those who realized that trends in miniaturization, computing, and communication would have to continue to accelerate, because they weren't evolutionary choices as much as developmental trajectories for modern society. Because most change that occurs in the universe is evolutionary, prediction is generally quite difficult, particularly for those who don't discriminate between evolutionary and developmental dynamics. But developmental processes, when they can be discerned, are surprisingly easy to predict. In the language of complexity studies, they are 'standard attractors', like the hole at the bottom of a basin, or fitness landscape. You cannot predict exactly how the "marble" (the system, evolution, us) is going to get to the bottom of the basin, as that is an evolutionary uncertainty, but you know all the evolutionary marbles in the system go through one of the few developmental "holes" available. There are two fundamental processes of change in universal systems: evolution, and development. The coming CI appears to be, by all present observations, a developmental emergence, and we can even measure its progress, speculate on its enabling and inhibiting factors, and predict its arrival from past progress, should we choose to do so. The CI network will not replace the keyboard, as some futurists have incorrectly claimed. For those who today have the education and resources to learn them, keyboards are sophisticated extensions of human will into the physical space. They will continue to increase in prevalence and sophistication, and will be with us as long as we continue to have biological bodies with ten fingers.
Nevertheless, at the same time we can expect that most human computer interaction will move beyond the keyboard, and the ease, power, universality, and sophistication of the CI network will make all our technologies embodied, egalitarian, and symbiotic as never before. It will be of great import to the human species to recognize the immense benefit of the CI network as a scientific and technological goal, and to claim it as soon as possible as one of our top national and multinational developmental priorities. It is the great "Moon Shot" of our era, though one with far greater immediate and global practical payoff on its arrival. No other technological or scientific goal will be as remotely transformative or beneficial to humanity within our present generation, and we should measure and revise our approach toward it on a quarterly basis until it arrives. To not do so only perpetuates a condition of collective ignorance of the developmental forces currently at work in technological systems on Earth, and of our own great potential to hasten the arrival of a more self-balancing, self-modelling, computation-rich mode of human existence. We are now beginning to appreciate that the coming CI network is not so much an engineering choice as much as it is a gathering storm, an emerging developmental tidal wave of computational advance in our civilization that we can only accelerate or delay (the choice is ours), but never prevent from arriving, even in our most catastrophic and dystopian scenarios. In an evolutionary developmental universe, many evolutionary paths are within our control, challenging us to be good stewards and navigators, but some developmental destinations, such as accelerating local computation, are apparently not, challenging us to be good cartographers, prioritizers, and students of physical dynamics. This phenomenon of continual acceleration, also known as acceleration or singularity studies, is in need of much greater scientific attention. 
Meanwhile, a new passion for deeper answers and an accelerating global compassion are emerging today within technologically advanced cultures, and with that, we can hope for a new more humble, more thankful, and more grounded sense of our own role in the great story of existence. Constructing the CI network, with all the advances in artificial intelligence, computer hardware and software, simulation, bandwidth, miniaturization, manufacturing, engineering, and secondarily, political, economic, and cultural changes that will be needed to facilitate its spread, is the apparent dominant priority for the first world countries of our generation, whether we realize this consciously or not. It appears to be a necessary next step in the universal teleology, or purpose, of local computational development and of human-machine, biological-technological symbiosis, a process set in motion when the first stone tool was taken up in the hunter's hand. The CI emergence that is presently stirring within a small subset of our planet's increasingly organic technological substrate will be as important, on the developmental hierarchy, as the emergence of vocal language was some 2 million years ago for Homo sapiens, who are themselves a small subset of Earth's tool-using species, such as pongids (apes and humans), cetaceans (dolphins and whales), birds, and even octopi. All these organisms have engaged in memetic evolution (cultural, gestural, and complex behavioral variation) for at least 14-20 million years, but only language has allowed our complex self-awareness and rich human culture to emerge. In the same fashion, giving our technologies voice, allowing them to talk both to us and to each other, represents perhaps the greatest global public good attainable in the Information Age. We will almost certainly see the CI network's emergence within our own lifetime.
Perhaps the most important remaining questions are how soon, how balanced, and how humanizing will be the path we take toward it.
Fifty years ago, the advent of digital computers moved us from the Industrial Age to the Information Age. But the Information Age is now getting on in years, and we will soon need a new phrase to capture the meaning of a coming environment where the average human interaction with the average computer is not via keyboard, but by voice. I and others have suggested that the Symbiotic Age is the most appropriate term for this coming era, as it will describe the dominant zeitgeist of the experience: a time when human beings finally feel both significantly empowered by and inseparably connected to their technological infrastructure. It will be a time when anyone on the planet who is comfortable with talking will be cheaply and intuitively connected to the machines around them, when we will start thinking of and talking to our machines as physical entities, flexible to our needs, when complaints and compliments that we have will be relayed to appropriate parties, when the user's vocalizations will be an integral, and eventually, dominant part of the utility of our technology. Circa 2015-2025, several forecasters expect our natural language processing (most difficult), bandwidth accessibility (less difficult), human simulation and language translation software (even less difficult), and voice recognition software (already here) to each be finally sufficiently powerful, affordable, and ubiquitous that a new type of interface will emerge. Around this time, the majority of human-computer interaction on our global computer network, however we choose to measure it, will shift to a new level of sophistication. That new level will be the move from our present simple keyboard- and mouse-driven, primarily graphical user interface (GUI, or "gooey") to an interface that is still graphically based and still uses a sophisticated mouse and keyboard (including virtual keyboard), but is primarily conversational (CI).
A somewhat different definition of the CI's arrival involves that point where the majority of code and hardware behind the average computational interface is designed to interpret human language and intent. This is the most difficult interface problem presently known, more difficult even than constructing realistic virtual graphical environments, where great and accelerating commercial success (e.g., video games) has occurred over the last decade. CI-era network machines, tools, and services, including educational services, will not simply exist to serve us web databases and graphics, as they do today, but to empower a vast range of intelligent, linguistically guided human-computer interactions, in an organic technological environment where the average human command to the network is delivered verbally, not physically (as via punch card, keyboard, mouse or other physical input device). Think of the opportunities for human development! There are so many new skills, empowerments, services, and products that will evolve from this new capacity that we may rightly consider its full benefit difficult to imagine.
In the ubiquitous, mass-affordable CI environment circa 2020, our cellphones, computers, buildings, tools, and websites will finally achieve John Sculley's 1980's "Knowledge Navigator" vision, becoming symbiotic semi-intelligent agents that do ever more helpful intellectual tasks for us in the networked world. The continued development of better "top-down" computing standards, such as Tim Berners-Lee's/W3C's semantic web, will be a part of this process. But the major part is likely to be evolutionary developmental and "bottom-up," like Microsoft's MindNet project, involving the integration of ever-smarter artificial neural networks, or their biologically-inspired equivalents, into the "back end" systems running all the tools and technologies we use. Even today, users of early CI systems (directory assistance, flight reservations, etc.) increasingly look forward to each hassle-free upgrade of the back end. Compare this with the mixed feelings we have toward user-guided upgrade processes, and the emerging human-machine symbiosis becomes tangible. We will all play a part, unwitting or not, as this drama unfolds in these final years of "unnatural interface."
On Phase Change Singularities: The Nature and Timing of CI Emergence
Circa 2020, I expect a highly useful set of CI-equipped interfaces, built on top of an increasingly parallel but still weakly biologically-inspired set of computer architectures, to begin to emerge. I suggest the CI must be a preliminary step before high-level machine intelligence. Therefore, understanding and measuring the process of CI emergence may give us insight into the dynamics of the technological singularity (generally human-surpassing machine intelligence) to follow. Why can't some gifted and motivated individual, or perhaps a massive team of individuals (say, Microsoft Research) create an adequate CI using mostly top-down rationally guided design, working in relative isolation from the rest of the communications activity of the planet? Such systems have been tried many times before, and they have predictably been far less valuable than their designers expected. Instead, a much more distributed system transition may be necessary for the CI to emerge. How must the "Symbiotic Era" of the CI emerge? I'd expect through a massively distributed computational system that records and analyzes the entire human conversation space, in all the major spheres of human interest and experience. Furthermore, this system must first conduct multifold creative evolutionary experiments to attempt to construct meaning from this conversation, and in this process a small developmental subset of highly useful natural language processing systems will be created. The emergence of these systems will not be sudden or isolated, but incremental and global, as they are guided and pruned over many years of continuous human conversation with them all across the planet. How far away are we from being able to create the next generation of such a system? I suggest you watch the development of the current generation, the planetary internet, to search for signs of the CI's emergence.
Today, roughly seventy percent of the 200 million daily verbal queries that Google (the most popular search engine on the planet) receives are novel. Urs Hölzle at Google wrote a thirty day user query cache circa 2000 but it was not useful, much to the surprise of the company. Too many of the queries are new and unpredictable to the system. When leading search engines begin to cache and do natural language processing on their user queries because most of them are repeating (circa 2015? 2020?), we will know that the human linguistic space has started to become canalized (a well-explored and frequently repeating phase space). Soon the entire human preference set, as expressed in written language, will be, to a first approximation, cataloged and monitored in real-time by our distributed network of computing machines. This is a form of effective computational closure, a necessary precondition for a phase transition singularity (the CI emergence) to occur. Presently there are at least two problems preventing this evolutionary developmental emergence. The first is that there is not enough memory available to the cache (not simply the last thirty days, but something approaching the entire written history of human inquiry needs to be cacheable by our technological systems, something we can't expect for another decade or two). The second is that there are probably not yet enough global users on the system. Google's 200 million queries/day in 2003 were generated by only a few hundred million regular computer users. As responsible globalization advocates remind us, it is probably safe to say that these users are not yet sufficiently representative of the full interests and inquiries of the six billion people presently on the planet. Fortunately, both of these problems will change profoundly in the next two decades. You may have heard that Microsoft is launching a major new search software development initiative.
This will be critical to the long-run economic success of the company, because verbally-driven search is the first-generation conversation of humans with machines. It is within the search space that the intelligent internet, and the next-generation CI-based operating systems, will emerge. Windows 2020 (perhaps better renamed Conversations 2020) will have to be built on such a platform, or Google-like systems will outcompete it for average human use. Google is becoming a truly unique distributed data processing platform ("GooOS"), and may well in coming years encode a full-featured operating system as an afterthought, rebuilding Windows functionality in a linguistically-driven Google language. Microsoft will have to match Google's distributed CI-based functionality in coming years. Failing to do so would risk being late to the next major reinvention of the planetary computing platform. As futurists Mark Finnern and Wayne Radinsky both note, Rich Skrenta's excellent post, "The Secret Source of Google's Power," relates that Google's competitive advantage springs from the features of its hardware network. It is developing a distributed computing platform that, in 2004, "can manage web-scale datasets on 100,000 node server clusters. It includes a petabyte, distributed, fault tolerant filesystem, distributed RPC code, perhaps including network shared memory and process migration. And a datacenter management system which lets a handful of ops engineers effectively run 100,000 servers." That's some impressive automation. Another way to understand the emergence of the internet-based CI is to explore the historical developmental phases of web search technology:
1) The first wave of web search was created by cheap disk drives (AltaVista won this war).
2) The second wave has been created by cheap CPUs and Beowulf cluster networks (Google won this war).
3) The third wave, cheap RAM, may be the next inevitable emergence.
Cheap RAM will make both massive fast caches and new, far more complex CI-based algorithms possible. Who will win the third wave? That is an open question, at present. All of this is not to suggest that human learning will decrease as we approach and enter the CI-enabled Symbiotic Era. To the contrary, our learning will clearly be shifted to a whole new level as all kinds of new collaborative and creative opportunities emerge in CI-driven real and virtual space. I expect the CI to enable our currently laughable digital avatars (digital personas, or "digital me" (DM)) to become both increasingly accurate reflections of the sum of our aspirations (e.g., Lifelogs) and increasingly effective coaches and counselors of our higher selves. Post-2020, I expect my DM will begin doing things in virtual space that are amazing by comparison to what I am doing in physical space. For a fun fictional account of the Symbiotic Era, see my teen-oriented essay "Future Heroes 2035: My Friends and I." It helps to realize that the biological portion of human species activity is sharply limited by the fixed number of us (6 billion) and the fixed speed of communication (200 miles/hour) within biological brains. From the perspective of computers that are growing their capacity exponentially, and learning and recording a million times faster than our own brains do (on a range of measures), the entire human phase space appears essentially frozen in spacetime. Capture, closure, and convergence between humans and our digital extensions are the dominant features we can expect, from our perspective. (Things look much more exploratory and dynamic from the perspective of the machines.) In the Symbiotic Era, a time when higher machine intelligence can exist only as a first-level reflection of human aspirations, the most important feature, for planetary intelligence, will be the "Intelligence Amplification" (IA) that increasingly powerful, CI-equipped systems provide to humans who are using them everywhere.
I'm presently assuming this era will comprise 30 years, from 2020 to 2050. That would place the arrival of the next singularity, the Autonomy Era, circa 2050. The latter era would involve the emergence of initially simplistic but eventually strongly biologically-inspired self-improving systems. Because I think such systems must gain their self-awareness through a process of personality capture and co-evolution with human beings, I expect they too will require a nontrivial length of time, in human years, to develop truly complex personalities. This might take perhaps ten human intelligence years, though this would represent a far longer stretch of time in machine intelligence years. That progression would fit with a circa 2060 technological singularity (a developmental "phase change" involving the emergence of human-surpassing technological intelligence). Intelligence amplification (IA) systems like the CI network are highly collectivist in their construction, and will be tested and refined by an entire planet's worth of users. They are also highly symbiotic, and will engage in extensive profiling, simulation, human factors internalization, and "personality capture" of their users' behaviors, habits, goals, limits, and rational and emotive states. As our second-generation, post-2020 CI systems increasingly use personality capture techniques to build sophisticated world-models of their users, we may expect that an increasing number of human beings will give their semi-intelligent agents access to and influence over their mind states in an always reversible but progressively more intimate manner. Most accurately, this should be considered a true first-generation form of "uploading," predating the more intensive and invasive forms of uploading that will occur in subsequent iterations of the symbiosis. Will you choose to let your 2030 machines cheer you up, advise you on your interaction style, or tell you when to take a work break?
Today there are robotic toys (AIBO, smart dolls) that already program their users to provide specific emotional responses. The tremendous utility, comfort, and productivity of CI's utilizing personality capture and more specialized tools such as knowledge management (a first-generation electronic forebrain) will compellingly demonstrate to millions of modern skeptics that the developmental destination of human-machine interaction is not some dystopian scenario of computer domination or isolation, but instead an increasingly seamless and symbiotic convergence.
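One way to make the canalization test described earlier in this section concrete is to measure how often a query repeats within a cache window. The following is a minimal sketch in Python, not a description of any search engine's actual cache: the function name `repeat_rate`, the `(day, query)` log format, and the case-and-whitespace normalization are all assumptions made for illustration.

```python
def repeat_rate(query_log, window_days=30):
    """Fraction of queries that were already seen within the last `window_days` days.

    `query_log` is a hypothetical list of (day_number, query_text) pairs,
    assumed to be sorted by day. One minus the result is the share of
    "novel" queries for that window.
    """
    last_seen = {}  # normalized query -> most recent day on which it appeared
    repeats = 0
    total = 0
    for day, query in query_log:
        # Crude normalization (an assumption): lowercase and collapse whitespace.
        key = " ".join(query.lower().split())
        previous_day = last_seen.get(key)
        if previous_day is not None and day - previous_day <= window_days:
            repeats += 1
        last_seen[key] = day
        total += 1
    return repeats / total if total else 0.0


if __name__ == "__main__":
    toy_log = [
        (1, "weather paris"),
        (2, "Weather  Paris"),     # repeat of the day-1 query after normalization
        (3, "cheap ram prices"),
        (40, "weather paris"),     # last seen on day 2, outside a 30-day window
        (41, "cheap ram prices"),  # last seen on day 3, outside a 30-day window
    ]
    print(repeat_rate(toy_log))                  # 30-day window
    print(repeat_rate(toy_log, window_days=60))  # a longer memory catches more repeats
```

On this toy log, one query in five is a repeat within thirty days. Lengthening the window raises the rate, which is the sense in which cheaper memory and a larger, more representative user base would both push the measured query space toward canalization.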
After the Symbiotic Age: Speculations on Autonomy and Beyond

It may be approximately and usefully true, as hierarchical acceleration models like the developmental spiral propose, that the Scientific Age lasted for 380 years (1390-1770), followed by an Industrial Age for 180 years (1770-1950), and that today's Information Age will last for only 70 years (1950-2020). If so, then we can expect the coming Symbiotic Age, driven by CI network advances in natural language processing, connectionist architectures, bandwidth, simulation, and general hardware and software development, to last approximately 30 years (2020-2050), further continuing the relentless MEST (matter, energy, space, and time) compression of computation that appears to characterize a coming developmental singularity, should the structure of spacetime allow this continued acceleration to the Planck scale. This future history roughly argues that the Symbiotic Age may usher in a greater amount of scientific, technological, and socioeconomic change than that seen in all previous human eras combined. Or it may not, if the total level of change begins to saturate once it reaches a threshold of local complexity (a separate topic, perhaps best reserved for a later discussion). What is clearer is that each era, whatever its total contribution to the change function, seems to run less than half the length of the previous one, representing a true asymptotic function. There are at least two subtle points we might make here. First, note the value of discriminating change into at least partially decoupled stages. If this model is correct, the next fundamentally new era will occur in a thirty-year period between 2020 and 2050. Ray Kurzweil proposes that change seen in the next 20 years will be equivalent to that seen in the last 200, and this perspective seems quite useful as a bird's-eye view.
But by choosing to additionally demarcate developmental stages, such as the Scientific, Industrial, and Information Ages, and the coming Symbiotic and Autonomy Ages, developmentalist models commit themselves to proposing an additional level of specificity to the curve of past and future change, one that allows for phases of apparent equilibrium prior to each new punctuated emergence. In other words, hierarchical developmental models almost always require that change proceed in a pattern known as "punctuated equilibrium": brief bursts of new activity followed by longer plateaus of consolidation, in which the final years of one stage are always significantly slower than the early years of the next. This future history we have offered, then, commits us to an expectation that the next twenty years will be primarily a less-remarkable continuation of the groundwork begun fifty years ago, at the birth of the Information Age. Seen in retrospect then, the period between the emergence of the developmentally inevitable Internet and its next necessary offspring, the CI network, will not likely be considered, from the human perspective, as a period of increasingly dramatic leaps, but rather of many steady, smaller, and less noticeable improvements, preparing us for the next great surge of technological change. Borrowing a popular phrase, we can say that the Symbiotic Age, our coming era of not just functionally but also linguistically adaptable physical and virtual machines, will require fifteen to twenty-five additional years of slow going and hard work before it suddenly and surprisingly becomes an overnight success. As a second subtle point, consider the apparent dichotomy of the unprecedented scientific and technological acceleration presently seen in first world countries, and their decreasing rate of cultural change as they develop ever finer distinctions in permissible social, politicolegal, and economic behavior.
Such deceleration of at least the magnitude of cultural change in the first world strongly suggests we are rapidly moving toward an end-stage, saturation phase of development in the human computational substrate. In other words, there's not much more social optimizing left that can be easily done by human beings running human societies. As Francis Fukuyama (The End of History, 1992) and others have observed, all the world's governments drift closer every year to a scientific, capitalistic, democratic final common pathway for social development. In this new world the average citizen, having a sharply finite capacity for change absorption, increasingly insulates their social and cultural consciousness from the developmental hurricanes occurring in the technological systems around them. Technological acceleration continues unabated; it just increasingly proceeds "under the hood" of the car of change, so to speak. Think about all the computation that went into the creation of an advanced hybrid automobile like the Toyota Prius, for example, and how oblivious we are to that, versus car technology of the 1950s. So does this mean that the total amount of socioeconomic change must slow down in the Symbiotic Age? Hardly. We only need to lose our first world bias to understand the unprecedented nature of the changes to come. If billions of presently marginalized human beings are uplifted toward first world socioeconomic status in coming decades, to a place where only tens or hundreds of millions have gone before, this will still represent massively unprecedented socioeconomic change for the average distributed complexity of humanity on Earth. We can expect these changes, given current trends, even as first world culture becomes increasingly canalized (comfortably settled) and politically correct with every passing year.
This new post-CI growth spurt of globalization will be tempered somewhat by technological systems that allow us to increasingly preserve and maintain existing cultural histories, and with less fidelity, cultural differences. It will occur as human individual and cultural consciousnesses become steadily better at either insulating themselves from, or balancing themselves within, the accelerating computational change occurring all over the planet. Globalization debates are today often framed in terms of how to help the third world rise to first world standards. By mid-century, they are likely to be framed in terms of how to help societies of every type become more change-seeking, versus change-averse, with regard to many powerful new opportunities for human-machine symbiosis. "Symbiozation," not globalization, sounds like the dominant cultural agenda in an era of late 21C economic abundance, alongside refining globalization to be far more equitable and pluralistic, and slowly demilitarizing a planet that may finally have sufficient transparency and trust to allow flatter and more bottom-up rather than top-down systems of global security. If we wish to be acceleration-aware in our forecasting, we may not be done yet. During the Symbiotic Age, the most successful of our CI's will likely incorporate substantial bottom-up-developed, biologically inspired evolvable hardware (EHW) components, as well as a wide range of scanned and reverse-engineered structural architectural elements of metazoan neural networks within their differentiating body plans. Somewhere within this process they will begin to develop a high-level, scalable, and robust ability to direct their own self-improvement, self-repair, and self-generation (e.g., limited self-replication, variation, and selection of even high-level neural architectures).
This vision argues that we must take a highly distributed, incremental, and network-centric developmental route to the "Planetization" of humanity, that concept so eloquently envisioned by Teilhard de Chardin in 1945. (See "The Planetisation of Mankind" in The Future of Man.) The strong claim that I wish to make is that, just as linguistic AI will require a planetary network of human beings, incrementally tuning up the conversational interface (CI) to human-level utility, so too it will require a planetary network of human beings tuning up all our robotic systems to produce a broad collection of utility robots that are sufficiently situationally intelligent to interact in the human environment. This problem is not an easy one. Like the CI, it is extremely complex, and robots will be incredibly stupid for decades. Only the input and detailed, collective feedback of their user-gardeners will allow them to become less than stupid. I call this perspective the "5% Rule": the idea that all major substrate transitions seem to be primarily ("95%") bottom-up and experimental, and only slightly ("5%") guided by top-down, hierarchical or developmental control. It suggests that the primary role of the average 21C human being, from technology's perspective, will be as a trainer and gardener of the growth of tomorrow's intelligent technologies, just as we today are socially constructing the wired and wireless participatory web. Perhaps we will choose to allow various forms of increasingly autonomous self-replication, with appropriate safeguards, because the adaptive machines they empower will be natural incremental extensions of the machine learning paradigms presently in existence, and because such extensions will demonstrate dramatically greater human utility, as well as ever more self-balancing and statistically safe behaviors in the vast majority of artificial selection environments.
But perhaps most importantly, our increasing understanding of biological, cultural, and technological immune systems, systems that were poorly understood by early 21st-century thinkers, will allow us to proceed with growing wisdom. As we begin to collectively train our machines, we can and should expect that any local catastrophes that do occur (e.g., unpredictably behaving, unsafe learning agents) will be rigorously contained by any redundant, fault-tolerant, healthy immune architecture. We will come to realize that those micro-catastrophes that do occur, within healthy immune environments, can only catalyze immune learning, increasing the "average distributed complexity" of the system, as well as general system intelligence. This increasing informational immunity appears to be one of the great hidden mechanisms that has guaranteed the accelerating hierarchical emergence of computational substrates in universal history. Within a few short years of this new self-directing, self-replicating capacity, we will begin to suspect that our increasingly self-modelling tools and agents (and by association, "we," as a human-machine social network) are a good deal more intelligent than surface appearances indicate. We can call that next era, an age of increasingly self-directing symbiotic machine interfaces, the Autonomy Age. By comparison with previous ages, it may last as little as 10 years (perhaps 2050-2060), taking us to the edge of a circa 2060 technological singularity, in this set of guesstimates. The transition to full autonomy in our technological systems will occur when it is ready. It may be unfortunately or intelligently delayed, or conversely may be charitably hastened along, but it is ultimately an unstoppable natural transition for technological development. Nevertheless, the way this transition appears, to human beings, will likely be largely within the control of post-singularity A.I., and influenced by their ethical concerns, whatever those may be.
Greg Stock (in Metaman, 1993) and others have written eloquently on this transition, but few have considered the requisite ethical constraints that must emerge within self-directing technological systems that have many orders of magnitude greater learning capacity than even our most cherished biological architectures. We may understand ethics as a form of behavioral immunity that protects an otherwise precarious accelerating intelligence development, and no known intelligent systems exist on Earth without the presence of a competent, healthy, overarching immune system. If you additionally suspect, as I do, that self-aware technological systems will quickly become a true superset of the biological space, containing all the elements of biology, plus many additional unseen capacities, including immunity and ethical capacities, then excellent arguments can be made that our "uploading" into the machine substrate will be inevitable, gradual, voluntary, desirable, and reversible (the latter at least in principle, if rarely in action) when viewed from the perspective of our present unmodified biological minds. Considerations of developmental ethics in complex systems are among a number of important outstanding questions in need of ongoing careful study, and are a priority for our organization. Expect much more on these topics in coming years from our emerging sciences of simulation.
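Returning to the era lengths quoted earlier in this section, the claim that each era runs less than half the length of the one before it can be checked with simple arithmetic. The sketch below is only an illustration in Python: the durations are the ones quoted in the text (380, 180, 70, 30, and 10 years), while the names `ERA_YEARS`, `compression_ratios`, and `end_years` are hypothetical helpers introduced here, not part of the developmental spiral model itself.

```python
# Era lengths in years, exactly as quoted in the text (guesstimates, not data).
ERA_YEARS = [
    ("Scientific", 380),
    ("Industrial", 180),
    ("Information", 70),
    ("Symbiotic", 30),
    ("Autonomy", 10),
]


def compression_ratios(eras):
    """Return (era, previous_era, length / previous_length) for each consecutive pair."""
    return [
        (name, prev_name, years / prev_years)
        for (prev_name, prev_years), (name, years) in zip(eras, eras[1:])
    ]


def end_years(eras, start_year):
    """Accumulate era lengths from `start_year` to get the closing year of each era."""
    year = start_year
    closing = {}
    for name, years in eras:
        year += years
        closing[name] = year
    return closing


if __name__ == "__main__":
    for name, prev_name, ratio in compression_ratios(ERA_YEARS):
        print(f"{name} Age is {ratio:.2f} times the length of the {prev_name} Age")
    # Start the Industrial Age at 1770, as in the text, and roll forward.
    print(end_years(ERA_YEARS[1:], 1770))
```

Every ratio comes out below one half, and rolling the quoted lengths forward from 1770 reproduces the 1950, 2020, 2050, and 2060 boundaries used throughout this future history.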
|