Singularity Watch

The Conversational Interface: Our Next Great Leap Forward
(aka Conversational User Interface, Linguistic UI, Natural UI, Spoken Dialog System, etc.)

© 2003-2014, John M. Smart. Reproduction, review and quotation encouraged with attribution.

Outline

The Conversational Interface

The Digital Twin (DT, or "Twin"): Your Emerging Digital Self

Predicting the CI Emergence

The CI Network: An Unrecognized Global Priority for Our Generation

More CI Details

On Phase Change Singularities: The Nature of CI Emergence

After the Symbiotic Age: Speculations on Autonomy and Beyond

The Conversational Interface: The Trigger to Perhaps the Biggest Set of Social-Technological Changes Today's Adults Are Likely to See in Our Lifetime

This essay, mostly written in 2003, predicts the near-term (2012 to 2019) emergence of a Conversational Interface (CI) on the global, mobile, and wearable web. The CI has also been called a Spoken Dialog System (SDS), Conversational User Interface (CUI), Linguistic User Interface (LUI), Voice User Interface (VUI), Natural User Interface (NUI) and other terms.

For a great easy-to-read book on the machine learning advances that will soon deliver the CI to internet-enabled devices everywhere in the world, I highly recommend Eric Siegel's Predictive Analytics, 2013. If you are short on time just read Chapter 6, which describes the way IBM's Watson question answering computer beat the two best human Jeopardy! quiz show champions in the world in February 2011, answering 90% of the questions correctly, when earlier versions of IBM machines, competing in the annual TREC conferences, could answer only 15% of such questions correctly in 2006. Here is a nice 10 min video, IBM Watson: Final Jeopardy! and the Future of Watson, 2011, that summarizes that historic event and looks a little bit into Watson's next applications.

The Jeopardy! contest was a credibility watershed for the CI, as it convinced many previous skeptics in the computer science community that natural language understanding was solvable by straightforward statistical and ensemble techniques, approaches which scale quite well as our online database sizes and computing power continue their exponential growth.
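To make the "statistical and ensemble techniques" idea concrete, here is a toy sketch, not Watson's actual architecture, of how an ensemble question answerer works: several independent, individually weak scorers each rate candidate answers, and a weighted combination of their votes picks a winner with an attached confidence. All function names, candidate fields, and scores below are illustrative assumptions.

```python
# Toy ensemble question answering: weighted combination of weak scorers.

def keyword_overlap(question, candidate):
    """Score a candidate by word overlap between question and its evidence text."""
    q = set(question.lower().split())
    c = set(candidate["evidence"].lower().split())
    return len(q & c) / max(len(q), 1)

def popularity(question, candidate):
    """Score by a (hypothetical) prior frequency of the answer in a corpus."""
    return candidate["corpus_frequency"]

def answer(question, candidates, scorers, weights):
    """Rank candidates by the weighted sum of all scorers; return best + confidence."""
    scored = []
    for cand in candidates:
        total = sum(w * s(question, cand) for s, w in zip(scorers, weights))
        scored.append((total, cand["text"]))
    scored.sort(reverse=True)
    confidence = scored[0][0] / (sum(t for t, _ in scored) or 1)
    return scored[0][1], confidence

candidates = [
    {"text": "Toronto", "evidence": "a city in canada", "corpus_frequency": 0.4},
    {"text": "Chicago", "evidence": "a us city with two airports named for wwii",
     "corpus_frequency": 0.3},
]
best, conf = answer("us city with two airports named for wwii heroes",
                    candidates, [keyword_overlap, popularity], [0.7, 0.3])
```

The key property, as with Watson, is that each scorer can be mediocre alone; the ensemble's accuracy scales as more scorers and more data are added, with no top-down theory of language required.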

As many have written since, it now seems reasonable to expect Watson-like abilities on our wearable smartphones by the end of this decade, and a new paradigm of teacherless education emerging for all the world's youth and adults. I believe the social, economic, and political implications of this emergence will be without parallel in human history, and I've sketched just a few of them below.

For a few technical books on the CI, see Spoken Dialog Technology: Toward the Conversational User Interface, Michael McTear, 2004; Practical Spoken Dialog Systems, Deborah Dahl (Ed), 2005; Speech and Language Processing, Daniel Jurafsky and James Martin, 2008; and Ensemble Methods in Data Mining, Giovanni Seni and John Elder, 2010. Some recent general-readership articles on the CI: Where No Search Engine Has Gone Before, Farhad Manjoo, Slate, 2013. (Let me know if you find others I should list here.)

As the primary interface to our species' accelerating digital storehouse of knowledge, the Conversational Interface seems very likely to be the most important enabling information technology development and collective intelligence advance on our planet in the next thirty years. Though the CI will have no self-awareness and very limited self-modeling, it will have a rapidly increasing contextual (embodied, situated) intelligence, as it will be 'programmed by the planet' of human users.

The CI will initially be used, as with current information technology, to foster a variety of new narcissisms, addictions, and dependency behaviors. First generation cellphones are used irresponsibly by drivers, as the phones aren't yet smart enough to prevent user abuse (texting while driving, etc.). First generation video games make kids less aware of and competent in the physical and social world, as the great serious games, global games, and wearable games haven't yet emerged, etc. Yet even the first generation CI will also deliver unprecedented new problem solving and educational capacities for those cultures, organizations and individuals who are motivated to use it for positive change, as we'll describe below.

With its biologically-inspired connectionist algorithms and parallel, ensemble-based architecture, the CI is a form of artificial intelligence (AI). Microsoft began working on the AI behind CIs in earnest with the founding of Microsoft Research in 1991. But the folks at Google, just five years old when this article was first written, advanced much faster and farther in this area, because they were experimenting in the right area, web search platforms. Google now seems to have an unbeatable functional and scale advantage with their leadership in globally distributed search and archiving, a strategy which will allow a conversational Google OS to emerge by the end of this decade, and get rapidly more intelligent every year thereafter. [Nov 2008 Update: See the Google Mobile App Voice for the iPhone and the new Google Search Wiki, two big new steps toward the CI. Aug 2013 Update: See tech journalist Robert Scoble's forward-looking review of Moto X, Google's smartphone with a dedicated processor that listens 24/7 to hear three words, "OK, Google Now" to do language processing for commands to Google Now, their voice user interface and intelligent personal assistant equivalent of Apple's Siri.]

As we will argue, the CI and its extensions, driven today most centrally by Google (and of course by all of us, using the world's leading search platforms), now seem very likely to have a global economic and technical productivity impact in the 21st century that will greatly exceed that of both the personal computer and networked computing eras of the 20th century. Technology foresight matters, now more than ever before. When Microsoft failed to recognize in the early 1990's that the CI platform would be far more likely to emerge in an incremental, statistical, bottom-up fashion, via superior connectivity and organization of the world's digital information on search platforms, rather than by some theory-driven, top-down design, they permanently lost their chance to sit at the center of the cyclone of the next generation of software systems for social value creation. They became another victim of the amazingly accelerative, chaotic, and creatively destructive forces of accelerating technological change.

When a cheap, ubiquitous CI and its high bandwidth network and simulation infrastructure arrives (2015? 2020? 2030? The exact date seems much more a matter of social, economic, political and technical choice than destiny), it will move us out of the Information Age into a fundamentally new era, one that has been called the Symbiotic Age by some futurists. This will be a time when all human beings on our planet, including the currently disenfranchised, functionally illiterate, and marginalized "bottom three billion," will be able to converse meaningfully with ubiquitous and semi-intelligent technological systems, and use them daily to solve a vast range of computationally trivial but very real human problems. A time when all of us feel truly symbiotic with our digital appendages. Our digital systems will very likely not be complex enough to be considered 'organic' within this timeframe, but they will feel like natural, indispensable extensions to our organic selves.

Most obviously, the CI will help us address the current global inequity of access to high-quality, lifelong education in our increasingly technological world. And once the digital and education divides have fallen globally (e.g., effectively 'unlimited' open source technological education becomes available to any human in education-permissive societies), the economic and political/power/equity divides can be expected to fall (or, as social responsibility advocates like to say, move from our currently transitional/unsustainable distribution toward more "rationalized" or "sustainable" distributions) within a generation or two of the CI's emergence.

There is also good early evidence that CIs will help us discover better collective solutions in governance, globalization, environment, security, health, and productivity, among other domains, and allow us to extract insight and knowledge from all the burgeoning data being collected in our increasingly transparent and quantified society. The post-CI world will be an amazing era to be alive, even while it is an era that is still several decades away from the so-called 'technological singularity,' the arrival of generally human-surpassing artificial intelligence in our most advanced computing machines.

A functional CI network will not entail significant machine self-awareness, but is a transitional stage of advanced natural language processing (NLP), a field that deserves far greater funding and attention than it attracts today. NLP advances will combine with critically-needed improvements in bandwidth of connectivity and the hardware and software of simulations, so that our CI devices and humanlike agents will "talk" both to each other and to us, using data-rich semantic protocols, with grammars, vocabularies and expressions that are continually 'tuned and pruned' by the daily interaction of hundreds of millions, then billions of humans with the system.

[2009 Note: Most importantly, we need to recognize that NLP advance is driven 95% by the availability of good, high-quality, semantically parsable data on the web, and the hardware to store it and networks to serve it, and only 5% by the effectiveness of any particular algorithms, software, or computational platforms. In other words, the CI will be primarily a bottom-up, data-driven, evolutionary emergence, and only slightly a top-down, engineered, developmental emergence. This "95% evolutionary vs 5% developmental" emergence pattern seems to be central to many complex systems emergences, as I argue in my 2008 paper on evolutionary developmental processes of change. A quick Google search (using near-conversational structure) shows there are others who share this critical perspective. As Yorick Wilks says in David Levy's Robots Unlimited, 2006, "Artificial Intelligence is a little software and a lot of data." Google researchers Alon Halevy, Peter Norvig, and Fernando Pereira articulate this data-driven strategy for AI creation quite eloquently in The Unreasonable Effectiveness of Data, IEEE Intelligent Systems, 2009. That paper goes a long way toward describing how the semantic web is actually emerging, and shows that what we programmers think we are doing to make it emerge is usually far less useful than finding better ways to accelerate the creation of big data, and to better store, access, and connect it, creating the digital environment within which our next-gen CI systems will necessarily emerge. For additional insights on architectural approaches to the next-generation CI, read Joe Colannino's 2009 Master's thesis (and slides) on using Statistically Improbable Phrases to automate semantic ontology creation. Joe says: "Statistically improbable phrases give, on average, ten times less information overload and more than double the confidence of retrieving relevant data as compared to Boolean keyword searching. If the Semantic Web is to become a reality, it will require automated algorithms such as TF x IDF operating on oodles of data."]
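The TF x IDF weighting the note mentions is simple enough to sketch in a few lines: a term's weight in a document is its frequency there (TF), discounted by how common it is across the whole collection (IDF). The tiny three-document corpus below is purely illustrative.

```python
import math
from collections import Counter

# Minimal TF x IDF sketch: terms frequent in one document but rare across
# the collection get high weights, which is what makes "statistically
# improbable phrases" stand out against a large corpus.

docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "the semantic web requires structured data and ontologies",
]

def tf_idf(term, doc_index, docs):
    words = docs[doc_index].split()
    tf = Counter(words)[term] / len(words)               # term frequency
    df = sum(1 for d in docs if term in d.split())       # document frequency
    idf = math.log(len(docs) / df) if df else 0.0        # inverse doc frequency
    return tf * idf

# "the" appears in every document, so its idf (and weight) is zero;
# "ontologies" appears in only one document, so it scores high there.
```

This is why the approach needs "oodles of data": with only a handful of documents the frequency estimates are noisy, but at web scale the same arithmetic reliably surfaces the semantically distinctive terms.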

What will a CI-enabled browser of 2015-2020 look like? For one thing, it seems clear that it will soon include sophisticated software simulations of human beings as part of the interface. Already today (2004), first-world culture spends more on video games than movies, and this will apparently be a permanent feature of our world from this point forward. These "interactive motion pictures" are more compelling and educational, particularly to our youth, the fastest-learning segment of our society, than any linear scripts, no matter how professionally produced. [2009 Note: It will also be very visual. Apparently youth under 12 in a few demographics now use YouTube, not Google, as their primary search engine, so they are visual searchers primarily, text searchers secondarily.]

Now imagine that we have begun talking to our computers in a crude but useful verbal exchange, a kind of 'pidgin grammar' circa 2015. It is clear that we will not simply want to talk to a disembodied machine. While some of us will be happy with simple graphic indicators telling us whether the machine understands us, many more of us will want to relate to virtual human beings, embodied agents that have the ability to communicate nonverbally, to frown or place their hand on their chin until they understand what we are telling them to do, to react word by word with realistic microexpressions to our statements and questions, to smile when they detect we are smiling at their jokes, to act in a calm and relaxing manner when they detect we are upset, to speak more rapidly when we are bored or hurried, etc. Why? Because having a nonverbal, quasiemotional communication channel operating in parallel with linguistic communication makes our words more efficient and effective. We may not want this for low-priority and multi-tasking communication, but we surely will whenever quality and accuracy and enjoyment and perhaps even speed are important. Thus we can see how, over time, we will want our CI-equipped virtual avatars to learn to model and display human emotion and body language. We'll want avatars for the same reason that Webkinz, digitally animated stuffed animals, are so successful with young kids. We want to relate to our technology. We want our mirror neurons to fire, we want our empathy activated, we want to feel like our technology is humanizing, improving its behavior, becoming more intimately connected to us. [2009 Note: And as soon as it is relating to us in a conversational way, we'll want to ensure that it's becoming more moral, too. See Moral Machines, Wendell Wallach and Colin Allen, 2009, for more on that fascinating insight.]

Given computer technology's far greater rate of innate learning by comparison to slow-switching biological information processors like us (on the order of ten millionfold greater; see Chaisson, Cosmic Evolution, 2001), it is clear that the computers will continue to adapt to and integrate with us, far, far more than the other way around.

The Digital Twin ("Twin"): Your Emerging Digital Self

Once we have reasonably good conversational interfaces and semantic maps, circa 2015-2020 in my guesstimation, a major new development we can expect at the same time is the Digital Twin (DT, or "Twin"), also called a personal software agent (PSA), intelligent assistant, agent avatar, software secretary, etc., which will use these interfaces and maps to construct crude models of its user's preferences and values. It's the user-modeling part that makes these agents into increasingly useful simulations of and aids for us as individuals. Twins will use as input user writings and archived email, realtime wearable smartphones (lifelogs), and verbal feedback, to allow increasingly intelligent and productive guidance of the user's purchases, learning, communication, feedback, and even voting activities, offloading much of the information overload and cognitive overhead of managing modern society from biohumans to their twins. As I see it, the intelligence amplification that results from our having twins will begin a major revolution in protecting and furthering the user's interests, leading us to a much more democratic society.

Twins will start out primitive, but they will quickly get good at filtering digital information streams for the user, answering simple questions, managing simple productivity tasks, and offering simple advice. Many people, walking in a supermarket or driving on the street, will reach past one brand of product, or drive past one type of store to another, guided there verbally or visually by their twin, which is continually using public data, user history, and algorithms to seek a better statistical match with their expressed values and preferences. Many companies will build twins, and we can expect the first generation of these to be given free to you by major companies and heavily manipulated by marketers, but it is also clear that companies with the best records of respecting privacy and empowering users will quickly become the most popular. Open source versions of twins will be built by those who distrust the corporate versions, or wish to maximize user control. [2013 Note: Decentralizing technologies, like twins, that empower people to make better personal and collective decisions, and to maximize their future freedom of action, seem very likely to increasingly dominate over processes used to maintain traditional hierarchies, in all market economies in coming decades. Decentralizing processes break down old hierarchies and allow new, better adapted ones to reform in their place. Consider the great flattening of corporate hierarchies that occurred with the introduction of the personal computer in the 1980's. Consider also the way Comcast's Xfinity and other tired monopolies will soon be disrupted by the arrival of true internet television as soon as we all get near-gigabit broadband, as I've written elsewhere.] Many people will allow their twin's record of their preferences to be at least partly public, so that other individuals, companies, and groups that share their preferences can easily find them.
Sharing and modifying preference files between twins will be automated or manual, as desired. Ads will be more personalized, and useful, than ever before.
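The "statistical match" a twin might compute can be sketched very simply: score the user's expressed values and each candidate product on the same dimensions, then rank candidates by cosine similarity. The dimension names, brands, and scores below are hypothetical illustrations, not a real product schema.

```python
import math

# Hypothetical twin preference matching via cosine similarity between a
# user's value vector and candidate products' attribute vectors.

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b.get(k, 0.0) for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# The user's expressed values, scored 0-1 on illustrative dimensions.
user_values = {"low_price": 0.3, "eco_friendly": 0.9, "local_producer": 0.7}

# Candidate products scored on the same dimensions (public data + history).
products = {
    "brand_a": {"low_price": 0.9, "eco_friendly": 0.1, "local_producer": 0.1},
    "brand_b": {"low_price": 0.4, "eco_friendly": 0.8, "local_producer": 0.9},
}

recommendation = max(products, key=lambda p: cosine(user_values, products[p]))
```

A real twin would of course use far richer models, but the design point stands: because the preference vector belongs to the user, it can be shared, compared, or withheld on the user's terms.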

Companies and governments that don't respect user preferences will increasingly be constrained by the shifts in consumption patterns and in initiative politics that will be coordinated by networks of individuals, using their twins to keep track of what kinds of purchases and votes and initiatives will best support their values. This is a level of individual empowerment that the web has always promised, and which will finally be delivered once we cross a threshold of intelligence amplification for the individual voter. Why will many, and eventually most of us run twins? Because we want a smart and highly personalized extension of our own memories and desires, one that will increasingly represent and motivate us, and that can increasingly act in more uniquely differentiated and creative ways.

Twins will guide us to purchase from the most socially responsible, innovative, and consumer-responsive of the corporations, and to back the most democratically advantageous initiatives in politics, chaining our corporations and the governments they have captured to a virtuous cycle. We have allowed our corporations in particular to concentrate wealth and power vs. all other actors for roughly the last century, and to prioritize growth over the common good. They have been aided in this slide to inequality and plutocracy by our first generation of communications, media and computing technologies, which are mostly "one to many" (hierarchical). Voters will increasingly get their power back in the "many to many" media world now emerging, but the greatest advance will be twins, as they will function as electronic augmentations of our own power in democratic states.

The rise of twins will shift the tide back to a more representative democracy, with less crony capitalism and greater economic equity (see Daron Acemoglu's Why Nations Fail for more on the costs of weak institutions and too large an internal rich-poor divide). In America, we've had an increasing rich-poor divide since the early 1960's, so if the reversal starts to happen in the 2020's, as I suspect, it will have been some sixty years coming. Political and economic activities will likely still be corrupt at the very top, and with the very rich. See Lee Kuan Yew's From Third World to First: The Singapore Story, 2000, for a great (and self-deprecating) story about how he was able to eliminate corruption at the mid-levels but had no power to eliminate it at the top in his modernization of Singapore. But as long as most of the system works for most people, our incredible record of scientific and technological acceleration will continue. All the world's powerful actors will increasingly be constrained in a global democratic and transparency cage, and we will move even more quickly toward a postbiological and far more ethical world.

I expect we will see such a world by the end of this century, so we may not have long to wait before we permanently and irreversibly leave the era of unaugmented biological humans running politics on Earth. Increasingly postbiological intelligence will very soon emerge on Earth, whether we want it to or not, and that's going to be a very different world. For some thoughts on what that world will look like, I recommend Robert Wright's Nonzero, 2001, Ray Kurzweil's The Singularity is Near, 2006, Wendell Wallach's Moral Machines, 2010, and Bowles and Gintis's A Cooperative Species, 2011. Books like these make clear that humans strive for accelerating positive-sum returns and the common good in general, and that money and power are corrupting mainly at the top, less so among the many. Furthermore, if our best AIs are going to be built by starting with scanned human mental patterns and computational neuroscience, as many of us believe is the fastest and most reliable path to AI, then they will start out with our level of morality as a base. And if morality is a function of individual and social complexity at playing positive-sum games, as I believe it is, our intelligent machines will rapidly exceed us in their moral capacities and behavior. There are a lot of "ifs" in this scenario, but I challenge you to come up with one that better accounts for humanity's record of ever-accelerating complexification, in a universe that seems biased to growing the leading edge of planetary intelligence from physics to chemistry to biology to biominds to technominds over time.

Additional scenarios for the future of twins can be found in my 2010 video, The Digital Self, and in philosopher Eric Steinhart's Survival as a Digital Ghost, Minds and Machines, 2007, 17:261-271. In particular we must understand the rationale for some of us to want a twin, as opposed to just a very good butler or servant, with its own separate personality, as many of us will choose butlers instead, at least at first. But as systems theorists Roger Conant and W. Ross Ashby argued in 1970, every good regulator of a system must be a model of that system. As our digital twins become better and better regulators of our biological selves, regardless of their initial personalities, they must become increasingly better models, extensions, and twins of ourselves.

At some point, we may learn how to merge even our higher thought with them, via brain-computer interfaces, and should that occur we would consider them an indistinguishable part of us. At that point, when our biological bodies die, this should subjectively feel, to our digital-biological hybrid self, simply like further growth and change, not death. This is speculative of course, and if you want more on that, visit the last section of this article below.

In the 1980's, technology futurist George Gilder talked eloquently about the Microcosm, the explosion/new environment/universe of inexpensive microprocessing power, which began in the 1960's, and ushered in the personal computer. In the 1990's he talked about the Telecosm, the explosion of inexpensive telecommunications via fiber optics and network technologies, which began in the late 1980's and ushered in advanced new forms of globalization. Futurist Bruce Sterling, in Shaping Things, and technologists Chris Stakutis and John G. Webster in Inescapable Data, have each talked about the Datacosm, the explosion of unstructured data on the web, which began in the late 1990's and has led us to fantastic new automated structuring tools like Google, and new data mining and competitive intelligence platforms.

In the early 2000's I began thinking about next steps in this hierarchy, and became interested in something I call the Valuecosm, the explosion of structured public and private maps, data sets, and statistical models of human preferences and values. We can think of the valuecosm as an element of the Semantic Web, that eloquent vision of Tim Berners-Lee, but focused most specifically on human values and preferences in a broad variety of contexts, together with graph-theoretic, Bayesian, and other models comparing those values quantitatively and qualitatively to others in the values space.

In concert with digital twins as our interface to the digital world, the emerging valuecosm will help us grow avatars that act and transact progressively better for us every day, will lead us to dramatically better discovery of potential positive-sum social interactions, to better and more distributed social network media and education, to great new subcultural diversity, and ultimately, to new ways to hold powerful actors accountable to democratic values.

In this way, as our digital twins begin to approach human-level sophistication later this century, we will use them to look after our values and advise us on our votes, purchases, and collaborative behaviors ever more powerfully, and thereby usher in a new level of global accountability of corporations, institutions, governments, and other large actors to human rights and democratic values. This will be the first generation of an era of total systems quantification, of both abstract and concrete issues of human value, to use futurist Alvis Brigis's excellent phrase, and perhaps the first advanced version of the digital democracy vision. See The Valuecosm, 2004 for more of these longer-term arguments, if interested.

Once we have reasonably good values maps on the web, and reasonably advanced twins, able to scour the web for us while we are asleep, and to act as our message and media screener and butler while we are awake, etc., imagine the positive implications for:

  • Subculture diversity and representation (great new experimentation in victimless variety)
  • Global communication and collaboration (no language barrier)
  • Global digital divide (nearly disappears)
  • Accountability of powerful actors (automated lobby twins for every group with votes and values maps)

As I've argued with my tongue-in-cheek Fourth Law of Technology, we must also expect, and try in advance to minimize, all kinds of first-generation problems with these technologies. Consider for example some of the first-gen downsides and concerns digital twins and the valuecosm might bring to:

  • Data security and privacy
  • Crime and fraud
  • Predictive marketing and consumer behavior programming
  • Public relations manipulation
  • Echo chambers/cocoons that polarize and lose touch with external realities
  • Parenting (how early can kids have DTs?)

Getting past the dehumanizing effects of these disruptive technologies that are inevitable in their first generation, moving on to the neutral effects of the second and finally the positive effects of the third generation and beyond, will be major challenges for designers, early adopters, critics, investors, entrepreneurs, politicians, and the other key players in our multifaceted society.

Using Pareto's Law, I would grossly predict that 20% of us will end up using twins and the valuecosm for net personal empowerment, to take us to amazing new levels of innovation, and to just-as-amazing new levels of collective ethics and sustainability. In other words, 20% of us will use these tools to be measurably better and more self-empowered than our parents were, on all the measures that matter to us.

At the same time, the other 80% of us may well choose to use these tools for new levels of fantasy, entertainment, distraction, and domestication. I don't think we have to worry so much about that, as long as we keep our citizens away from the worst of the new addictions and dependencies, don't let them slide into ignorance, thereby creating an Idiocracy, and don't allow our use of these tools to cause structural violence, a term coined by futurist Johan Galtung.

In other words, as long as the 20% of folks who get 80% of the work done in any society (Pareto's "Vital Few") are significantly empowered by these platforms, everyone else can take a long-deserved rest from millennia of toil, brutality, and hardship, for as long as they want to, in fact. So it's a "both/and", bimodal world we are headed toward. 20% of us will choose to get more empowered and 80% will likely choose more slack, entertainment, and distraction. The leisure society that emerged in the 20th century, so well articulated by futurist Herman Kahn in the 1960's, will continue its inexorable advance to new heights of comfort and domestication in the 21st century. Nevertheless, I'm quite convinced that we are not going to see an idiocracy emerge, at least for the next few decades. The 20% who are the opinion leaders and political and economic movers and shakers will be compelled to ensure that the 80% are free enough and educated well enough to be civically minded and personally responsible for some measure of their own social advancement. If they don't address this problem, the 80% will vote itself ever-increasing financial and social entitlements each year, outstripping the real wealth and productivity of the nation, and those democracies (think of Greece, Portugal, Spain, etc.) will quickly slip into financial insolvency, while the more technically productive and evidence-driven democracies (Scandinavia, Germany, Switzerland, Singapore, Taiwan, South Korea, etc.) will surge ahead as new leaders.

One could argue, and many have, that the United States, with its fifty years of rich-poor divide growth and increasingly poor K-12 education system, is in danger of becoming an idiocracy in the next generation or two. But the truth is, today's youth are far smarter than that, and the CI, teacherless education, and the symbiotic age will emerge far, far sooner than any real idiocracy could arrive in the U.S.

We can do our best to improve our very slow-moving, inertia-bound political and economic systems, but if we are to learn the universe's lessons of accelerating change, we must roll up our sleeves and focus primarily on building the continually accelerating scientific and technological systems that will deliver a far more democratic, values-driven, and teacherless-education world. Everything else but our science and technology, and our interface to them, is so slow-moving and slow-growing as to be nearly future-irrelevant by comparison.

Predicting the CI Emergence

When can we expect the CI's emergence? In March 2005 Google's director of search Peter Norvig noted that their average query is now about 2.5 words, by comparison to 1.3 on Alta Vista in its heyday, circa 1998. In subsequent email conversation he has told me that the actual number is "closer to 2.6 or 2.7." If this is a quasiexponential function, that is an initial doubling time of only seven years.

It appears that the growth of the CI as a complex adaptive technological system is in the early phase of an S-curve, well before the inflection point, and thus its growth will continue to look exponential for some time to come. [September 2008 Update: Average query length to Google now exceeds 4 words, apparently just this month. This is more early evidence that this phase of search query length growth will remain exponential up to the inflection point.]

In my opinion this average search query length, averaged across all the leading search engines of the day (Google, Yahoo!, Bing, etc.) will be one of the key numbers to watch to gauge the growing effectiveness of statistical natural language processing (statistical NLP) in creating a conversational front end for the internet and all our other complex technologies in the 21st century.

A slide from one of my technology foresight presentations, below, attempts to summarize this point:

How long might the emergence of a full-fledged CI take? That's a big guess today, but given that it has taken approximately seven years to double from 1.3 to 2.6 words per average query, we should expect another seven years, circa 2012, to get us to 5.2 words, a period I suspect would be just prior to emergent grammars and the feeling of CI intelligence (something "semismart" on the other end of the line) on the part of the user. If these proposals and qualifiers are approximately correct, this would place the intelligent CI's emergence circa 2015-2020. [2008 Note: One could even argue that our very first generation (slow and very limited) CI has finally emerged with Google Voice on the iPhone in 2008.]
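The doubling arithmetic above can be sketched in a few lines of Python. The 1998 baseline (1.3 words) and the seven-year doubling time are the essay's own figures; the pure-exponential model is, of course, only a rough assumption.

```python
# Sketch of the back-of-envelope extrapolation above: average search query
# length doubling every ~7 years, starting from 1.3 words in 1998.
# The constant-doubling model is an assumption, not a measurement.

def projected_query_length(year, base_year=1998, base_words=1.3, doubling_years=7.0):
    """Average query length (in words) under a constant-doubling-time model."""
    return base_words * 2 ** ((year - base_year) / doubling_years)

for year in (1998, 2005, 2012, 2019):
    print(year, round(projected_query_length(year), 1))
# 2005 -> 2.6 words, 2012 -> 5.2, 2019 -> 10.4, matching the milestones in the text
```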

[2010 Note: The Nexus One, Google's new Android phone, is another amazing step forward in pairing the emerging conversational interface with very humanly useful functions, such as turn-by-turn navigation. As you may recall, the replicants in one of the greatest sci-fi films of the 20th century, Blade Runner, were Nexus Sixes, near-human androids, but with an engineered four-year lifespan. Kind of like phones, if you think about it. :) How long will it be before we are wearing a Nexus Six? How long before our Nexus DT is smart enough to pass a Turing/Voight-Kampff test as human? How long until your twin becomes a good representative of you? Think I'm joking? Don't bet against it!]

When the Nexus One is combined with Siri, a powerful new NLP and AI plugin available for the iPhone today, and for the GPhone soon (according to the grapevine), the conversational interface will take another important step forward. Take a look at Siri; it is smarter than you probably suspect, and a great potential acquisition target for Google. Here's hoping they get acquired; we'll see. [2010 Note: We know what happened here. Apple, not Google, made the acquisition, so Google had to play catchup.]

English speakers appear to use an average of 8-14 words per written sentence and 5-11 words per spoken sentence (depending on context) when we ask each other complex questions. I would expect that as soon as our average search queries get up over eight words, if not before, we'll start to see and expect emergent "pidgin" grammars in our computers' responses. Since voice recognition and text-to-speech are already largely solved (they are far easier problems than NLP), we'll be speaking and listening to those sentences. At that point, we'll begin to feel like our computers have a primitive conversational intelligence.

[2008 Note: Google announced a voice recognition app for the iPhone this month. A NYT article on Google's voice technology says their statistical model is composed of two trillion tokens (unique words, and some of the ways they can be strung together to ask common questions). Having this application available, even if it is used only for simple queries for the next few years, should easily push the average query past four words, as it is far easier for the average person to speak a longer sentence than to type one. And Google SearchWiki, if it is used extensively, promises another advance toward building a statistical map of words that should be strung together to answer human questions. Google is starting with just a personalized version of your search results, but they will clearly eventually release collectively aggregated and friend-aggregated versions as well.]

[2005 Note: Already today, when you use "near" in a sentence on Google, as in: "Coffee shops near Palo Alto", which returns a Google Map (yay, Google's now got an optical cortex!) of yellow pins, all distributed around Palo Alto's city center, you are using a query length of four or five words. But this doesn't yet feel like conversational intelligence.] That's where I suspect we'll be in 2012, a lot of folks using a lot of simple verbal operators but probably still mostly by keyboard [2008 update: With the release of Google's app for the iPhone this year, it seems possible that voice queries might begin to rival keyboard queries by 2012, as there are so many more phone users than computer users. I'd love to see an internal projection for that].

Now step forward another seven years, to 2019, and in my estimation we will likely have doubled our search length yet again, to just over ten words per average query. Somewhere between 2012 and 2019 I expect we'll see voice recognition queries, most of them mobile/wearable, begin to compete with keyed entry for human-to-machine communication, along with a new level of sophistication (and user feedback/ranking/rating) in the average query. This time, I think we have enough new functionality to create a "step function" in user experience, where the web no longer feels like just an information appliance; it feels like a partner, a crude extension of our linguistic ability. That's what I would call the end of the Information Age and the start of the Symbiotic Age. From that point forward many of us will begin to feel naked and somewhat stupid out in public without the web, the way we'd feel out in public without our clothes today.

What is our evidence that this query length doubling will continue? It's weak today, but I think still worth watching closely. First, consider that seven years per doubling would be just over the six-year doubling in average software productivity or general algorithmic efficiency quoted by Bill Joy and others in the IT industry as a rough "Moore's law for software." But doubled algorithmic power or efficiency alone would be unlikely to translate to doubled query lengths. We will need a lot more insight before we can make this claim.

My main intuition in this regard is that the entire human conversation space (the space of most useful human conversations, regardless of context), while still very much larger than our digital record of it today, is becoming an effectively 'closed' (slow growing, nearing saturation) phase space, what the physicists call 'ergodic', and furthermore that the encoding of human conversation in easily spiderable form is doubling in volume at an enormous rate by comparison (roughly every two years, since the start of the web), and that our ability to rank the relative value of those encodings is also steadily improving (Google's PageRank, Web 2.0, 3.0, etc.).

In other words, I suspect that the human conversation phase space, in all languages, and all digital forms (web publishing, email, audio, video, chat, and other searchable conversations) while still growing slowly in novelty, is becoming effectively, approximately, or statistically computationally closed relative to the rapid mapping of this space by technological intelligence. If this is so, all the most useful and functional thoughts/ideas/sentences in the human mind space are increasingly frequently revisited by technological indexes. The 'map' grows only slowly relative to the 'mapmaking,' which gets finer-grained every year.
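As a toy illustration of this closure argument, consider an indexed corpus that doubles every two years chasing a conversation space that grows only a couple of percent a year. Every constant below is invented for illustration, not a measurement.

```python
# Toy model of the 'closure' argument: the digital record of human
# conversation doubles every ~2 years, while the space of useful human
# conversation itself grows only slowly, so the mapped fraction saturates.
# All constants here are illustrative, not measurements.

def mapped_fraction(years, corpus0=1.0, corpus_doubling=2.0,
                    space0=1000.0, space_growth=0.02):
    corpus = corpus0 * 2 ** (years / corpus_doubling)   # fast exponential
    space = space0 * (1 + space_growth) ** years        # slow exponential
    return min(corpus / space, 1.0)                     # cap at full coverage

for y in (0, 10, 20, 30):
    print(y, round(mapped_fraction(y), 3))
# coverage climbs from a tiny fraction toward complete within a few decades
```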

If this is true then the hard problem of serving up a useful, semi-intelligent natural language response to an online query is very much like codebreaking, a cryptographic problem that involves finding the set of primers, or translation elements, that are repeatedly used to transform one set of information into another. For the CI, this is the transformation of the world wide web of digitally encoded symbols of use to human beings, into another, the most useful linguistic responses to queries about common human problems. At the same time, new intelligence/information emerges through the associations made during the translation. Codebreaking, like many natural growth processes, follows a logistic curve (an S-curve) in performance over time. Early in the process (the first 'flat' part of the S-curve) it's hard to get the primers. Then you enter into a positive feedback situation where you are getting the primers for the most used words (the 'fat head' of the Zipf's law distribution) and that makes it easy to decode the other high-frequency words. Then you hit the inflection point, having gotten most of the easy words, and you start chasing after increasingly less used words, with less reliable associations to other words, and you've hit the phase of declining returns (the 'top' of the S-curve, saturation, system 'senescence' in performance). But in the first phase of growth, before reaching the inflection point, the performance growth is roughly exponential.
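A minimal numeric sketch of that logistic claim follows; all parameters are arbitrary, chosen only to show the shape. Before the inflection point, successive values grow by a nearly constant factor (exponential-looking growth); after it, the growth factor collapses toward 1.0 (saturation).

```python
import math

# Logistic (S-curve) performance over time, as in the codebreaking analogy.
# cap, rate, and t_mid are arbitrary illustrative parameters; t_mid is the
# inflection point, where half the 'code' has been broken.

def broken_fraction(t, cap=1.0, rate=0.5, t_mid=10.0):
    return cap / (1.0 + math.exp(-rate * (t - t_mid)))

# Growth factor over each 2-unit step: nearly constant well before t_mid
# (exponential-looking), then shrinking toward 1.0 after it (saturation).
for t in (2, 4, 6, 14, 16, 18):
    print(t, round(broken_fraction(t + 2) / broken_fraction(t), 2))
```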

I suspect the saturation point in query length will come at some sentence length that is slightly longer than the average human-to-human query length in spoken sentences. I suggest this because when you query Google, it is often to your advantage, even when asking technical questions today, to include additional words beyond those you'd normally ask any human in natural conversation. You use those specialized words with Google because you suspect (and it is increasingly true) that Google knows "everything," unlike the average human.

If 2020 is our expected transition point, as has seemed most likely to me since I first thought about this issue circa 2000, then CIs are a bit farther off than some of our most optimistic technology futurists would today have us believe. Yet they are also arriving much sooner than the naysayers predict, those who tell us NLP is riddled with near-insoluble problems and who don't understand how far we've already advanced with simple statistically based systems.

I have heard that Google, for example, has won U.S. NIST's automated language translation competition, over IBM's and others' ontological and mixed systems, by using a relatively simple statistical NLP approach (tied, of course, to a large and ever-growing online corpus), for at least two years in a row (2005 and 2006).

With leadership, luck, resolve, and exponentially more powerful computing, text analytics, and communications platforms, we might even be able to accelerate CI development to occur earlier than the 2020 ETA. In addition to broadband and wireless access for everyone, I suggest that may be one of the noblest challenges of our generation, in fact.

It seems very likely that we will all soon widely recognize this as the Next Great Leap after the internet. Read on, and let us know if you agree. If you want more speculation on this topic which compares the CI to a network that came before it, you may also enjoy Promontory Point Revisited: The Transcontinental Railroad and the Conversational Interface.

 

The CI Network: An Unrecognized Global Priority for Our Generation

Annually for the last six years, John Brockman's provocative World Question Center at Edge.org has posed an interesting question to roughly 100 edge-thinkers who are committed to integrating both scientific and humanist perspectives on the world. In 2002, they answered a fictional letter from then-President G.W. Bush, which asked each of them, as the administration's new science advisor:

"What are the pressing scientific issues for the nation and the world, and what is your advice on how I can begin to deal with them?"

As a developmental futurist, one who expects that a special subset of future events are statistically inevitable and highly predictable, I drafted and sent John my own unsolicited response below.

Clearly the keyboard is a primitive, first-generation interface to our personal computational machines. It gives us information, but not symbiosis. We humans don't twiddle our fingers at each other when we exchange information. We primarily talk, and use a rich repertoire of emotional and body language in simultaneous, noninterfering channels.

We have also used our hands to help each other and to manipulate objects ever since the first hominid threw the first stone, so it is also clear that keyboards won't disappear until the human form itself disappears.

In other words, talking is the highest, most natural, and most inclusive form of human communication, and soon our computers everywhere will allow us to interface with them in this new computational domain. I think that achieving this emergence will be one of the greatest of all the technological "Moon Shots" we engage in during our own brief time here on Earth, whether we presently realize it or not.

Science and technology, and broadly, local computation, appear to be asymptotically accelerating, universally-driven phenomena. We are now coming to understand that humanity does not control this developmental process, but rather selectively catalyzes it, ideally with ever increasing social, organizational, and personal foresight.

The entire 20th century demonstrated an astounding, unrelenting, unprecedented double exponential growth in the price performance of our computational machines. At the same time, we have seen new levels of computational autonomy, or human-independence emerge, wherein a rapidly diminishing fraction of human effort is required to produce any fixed amount of computational complexity within each new computing system. These are apparently universal developmental trends, not architected by human design or even desire. For the last decade at least, increasingly evolutionary and biologically inspired forms of computation have become the leading edge of technological development, and will remain so for the foreseeable future.

Referring to the difficulty of technology prediction, Bill Gates reportedly once said "find me the person who predicted the internet, and we'll make him king." This is congruent with a common myth that futurists missed this major development, and certainly many did. The first major "think tank" long range public futures project of the postwar era, The Year 2000: A Framework for Speculation on the Next Thirty-Three Years, Herman Kahn and Anthony Wiener, 1967, certainly missed the decentralization trend, though they did see computing continuing to accelerate. But that only shows the riskiness of relying on one forecasting group to understand the future. Every community has its own biases.

Among the global community there were numerous visionaries who foresaw various pieces of the internet long before it emerged. In 1937, H.G. Wells in "World Brain," articulated the developmental inevitability of a rapidly updating compendium of total world knowledge. In 1945, in "As We May Think," Vannevar Bush proposed the Memex, a proto-hypertext microfiche network that would organize and distribute the world's knowledge, and noted "The advanced arithmetical machines of the future will be electrical in nature, and they will perform at 100 times present [electromechanical relay computer] speeds."

In 1946, just one year into the modern television era, Will F. Jenkins (aka Murray Leinster) in "A Logic Named Joe," predicted "logics," televisions with attached keyboards that were networked by a switching innovation called the "Carson Circuit," that would be used to watch TV, make video phone calls, send and receive telegraphic messages (email), get weather reports, ask research questions, keep books, trade stocks, and play games. Sound like the internet to you? Sure does to me.

The emergence of personal computers was repeatedly predicted by journalists and commentators in the 1950's, and such machines were a longtime goal of electronics hobbyists, who were building successively more complicated home-built electronic systems. Peter Drucker predicted our 1980's economic shift to the information economy in the revolutionary Age of Discontinuity, 1968. Alvin Toffler expanded on this, and on the coming network of "electronic cottages" we would see in the 1990's, in The Third Wave, 1980.

In other words, there were many harbingers of the internet for those willing to look, and those who realized that trends in miniaturization, computing, and communication would have to continue to accelerate because, borrowing from the biological language of evolution and development, they weren't just evolutionary choices; these particular trends were developmental forces that the universe (our extended environment) was imposing on modern society.

Because most change that occurs in the universe is evolutionary, I believe prediction is generally quite difficult, particularly for those who don't discriminate between evolutionary and developmental dynamics. But developmental processes, when they can be discerned, are surprisingly easy to predict. In the language of complexity studies, they are 'standard attractors', like the hole at the bottom of a basin, or fitness landscape. You cannot predict exactly how the "marble" (the system, evolution, us) is going to get to the bottom of the basin; that is an evolutionary uncertainty. But you know all the evolutionary marbles in the system go through one of the few developmental "holes" available.

To recap, there appear to be two fundamental processes of change at work in all universal systems: evolution and development. The coming CI appears to be, as far as I can determine, a developmental emergence, and we can measure its progress, speculate on its enabling and inhibiting factors, and even predict its arrival from past progress, should we choose to do so.

The CI network will not replace the keyboard, as some futurists have incorrectly claimed. For those who today have the education and resources to learn them, keyboards are powerful extensions of human will into the physical space. They will continue to increase in prevalence and sophistication, and will be with us as long as we continue to have biological bodies with ten fingers. Nevertheless, at the same time we can expect that most human computer interaction will move beyond the keyboard, and the ease, power, universality, and sophistication of the CI network will make all our technologies embodied, egalitarian, and symbiotic as never before.

In an evolutionary developmental universe, many evolutionary paths are within our control, challenging us to be good stewards and navigators, but some developmental destinations, such as accelerating local computation, are apparently not, challenging us to be good cartographers, prioritizers, and students of physical dynamics. This phenomenon of continual acceleration, also known as acceleration or singularity studies, is in need of much greater scientific attention. We will almost certainly see the CI network's emergence within our own lifetime. Perhaps the most important remaining questions are how soon, how balanced, and how humanizing will be the path we take toward it.

More CI Details

Fifty years ago, the advent of digital computers moved us from the Industrial Age to the Information Age. But the information age is now getting on in years, and we will soon need a new phrase to capture the meaning of a coming environment where the average human interaction with the average computer is not via keyboard, but by voice. I and others have suggested that the Symbiotic Age is the most appropriate term for this coming era, as it will describe the dominant zeitgeist of the experience—a time when human beings finally feel both significantly empowered by and inseparably connected to their technological infrastructure. A time when anyone on the planet who is comfortable with talking will be cheaply and intuitively connected to the machines around them, when we will start thinking of and talking to our machines as physical entities, flexible to our needs, when complaints and compliments that we have will be relayed to appropriate parties, when the user's vocalizations will be an integral, and eventually, dominant part of the utility of our technology.

Circa 2015-2025, several forecasters expect our natural language processing (most difficult), bandwidth accessibility (less difficult), human simulation and language translation software (even less difficult), and voice recognition software (already here) each to be finally sufficiently powerful, affordable, and ubiquitous that a new type of interface will emerge. Around this time, the majority of human-computer interaction on our global computer network, however we choose to measure it, will shift to a new level of sophistication. That shift will take us from our present simple keyboard- and mouse-driven, primarily graphical user interface (GUI, or "gooey") to an interface that is still graphically based and still uses the mouse and keyboard (including virtual keyboards), but is primarily conversational: the conversational interface (CI). A somewhat different definition of the CI's arrival is the point at which the majority of the code and hardware behind the average computational interface is designed to interpret human language and intent.

This is the most difficult interface problem presently known, more difficult even than constructing realistic virtual graphical environments, where great and accelerating commercial success (e.g., video games) has occurred over the last decade. CI-era network machines, tools, and services, including educational services, will not simply exist to serve us web databases and graphics, as they do today, but to empower a vast range of intelligent, linguistically guided human-computer interactions, in an organic technological environment where the average human command to the network is delivered verbally, not physically (as via punch card, keyboard, mouse or other physical input device). Think of the opportunities for human development! There are so many new skills, empowerments, services, and products that will evolve from this new capacity that we may rightly consider its full benefit difficult to imagine.

There have been a number of impressive early starts at spoken dialog productivity platforms, like Virtuosity's automatic speech recognition (ASR) voice-activated telephone assistant, Wildfire, a promising but failed commercial effort in the early 2000's, and there are emerging web standards, such as VoiceXML. See Wikipedia's entry on Dialog Systems for recent developments. There are promising basic research efforts, such as Microsoft's Natural Language Processing (NLP) Group, the consultancy SRI International (they have a 50-person Artificial Intelligence Center), and NASA Ames, who have developed third-generation voice recognition systems now able to identify emotion, punctuation, and other higher-level meaning from the prosody (variable pitch, timing, loudness) of a voice stream. SRI's Elizabeth Shriberg projects that software with the ability to extensively decode prosody, as well as to reliably filter out background noise from any real-world voice stream, will arrive circa 2012.

In the ubiquitous, mass-affordable CI environment circa 2020, our cellphones, computers, buildings, tools, and websites will finally achieve John Sculley's 1980's "Knowledge Navigator" vision, becoming symbiotic semi-intelligent agents that do ever more helpful intellectual tasks for us in the networked world.

The continued development of better "top-down" computing standards, such as Tim Berners-Lee's/ W3C's semantic web, will be a part of this process. But the major part is likely to be evolutionary developmental and "bottom-up," like Microsoft's MindNet project, involving the integration of ever-smarter artificial neural networks, or their biologically-inspired equivalents, into the "back end" systems running all the tools and technologies we use. Even today, users of early CI systems (directory assistance, flight reservations, etc.) increasingly look forward to each hassle-free upgrade of the back end. Compare this with the mixed feelings we have toward user-guided upgrade processes, and the emerging human-machine symbiosis becomes tangible. We will all play a part, unwitting or not, as this drama unfolds in these final years of "unnatural interface."

 

On Phase Change Singularities: The Nature of CI Emergence

Circa 2020, we may expect a highly useful set of CI-equipped interfaces, built on top of an increasingly parallel but still weakly biologically-inspired set of computer architectures. The CI seems a necessary prerequisite to high-level machine intelligence. Therefore, understanding and measuring the process of CI emergence may give us insight into the dynamics of the technological singularity (generally human-surpassing machine intelligence) to follow.

Why can't some gifted and motivated individual, or perhaps a massive team of individuals (say, Microsoft Research) create an adequate CI using mostly top-down rationally guided design, working in relative isolation from the rest of the communications activity of the planet? Such systems have been tried many times before, and they have predictably been far less valuable than their designers expect. Instead, a much more distributed system transition may be necessary for the CI to emerge.

How must the "Symbiotic Era" of the CI emerge? I'd expect through a massively distributed computational and information storage system that records and analyzes the entire human conversation and behavior space, in all the major spheres of human interest and experience. Furthermore, this system must first conduct multifold creative evolutionary experiments to attempt to construct meaning from this conversation, and in this process a small developmental subset of highly useful natural language processing systems will be created. The emergence of these systems will not be sudden or isolated, but incremental and global, as they are guided and pruned over many years of continuous human conversation with them all across the planet.

How far are we from being able to create the next generation of such a system? I suggest you watch the development of the current generation, the planetary internet, for signs of the CI's emergence. Today, roughly seventy percent of the 200 million daily queries that Google (the most popular search engine on the planet) receives are novel. I have heard that Urs Hölzle at Google built a thirty-day user query cache circa 2000, but it was not useful, much to the surprise of the company. Too many of the queries at the time were new and unpredictable to the system. When leading search engines begin to cache and do natural language processing on their user queries because most of them are repeats (circa 2015? 2020?), we will know that the human linguistic space has started to become ergodic (a well-explored and frequently repeating phase space). Soon the entire human preference set, as expressed in written language, will be, to a first approximation, cataloged and monitored in real time by our distributed network of computing machines.
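The "watch the novelty rate" test is easy to state in code. This is a sketch with an invented query stream; the point is simply that when the novel fraction falls well below the roughly 70% figure quoted above, caching and deeper NLP on repeated queries start to pay off.

```python
# Sketch of the ergodicity test described above: what fraction of incoming
# queries has never been seen before? The query stream below is invented;
# the ~70% novelty figure in the text is the real-world comparison point.

def novelty_rate(queries):
    seen = set()
    novel = 0
    for q in queries:
        if q not in seen:
            novel += 1
            seen.add(q)
    return novel / len(queries)

stream = ["weather", "coffee shops near palo alto", "weather",
          "cheap flights to sfo", "weather", "coffee shops near palo alto"]
print(novelty_rate(stream))  # 3 novel queries out of 6 -> 0.5
```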

This is a form of effective computational closure, a necessary precondition for a phase transition singularity (the CI emergence) to occur. Presently there are at least two problems preventing this evolutionary developmental emergence. The first is that there is not enough memory available to the cache: not simply the last thirty days, but something approaching the entire written history of human inquiry needs to be cacheable by our technological systems, something we can't expect for another decade or two. The second is that there are probably not yet enough global users on the system. Google's 200 million queries/day in 2003 were generated by only a few hundred million regular computer users. As responsible globalization advocates remind us, it is probably safe to say that these users are not yet sufficiently representative of the full interests and inquiries of the six billion people presently on the planet.

You may have heard that Microsoft has recently (2004) launched a major new search software development initiative. This will be critical to the long-run economic success of the company, because verbally driven search is the first-generation conversation of humans with machines. It is within the search space that the intelligent internet, and the next generation of CI-based operating systems, will emerge. Windows 2020 (perhaps better renamed Conversations 2020) will have to be built on such a platform, or Google-like systems will outcompete it for average human use.

Google is becoming a truly unique distributed data processing platform ("GooOS"), and may well in coming years encode a full-featured operating system as an afterthought, rebuilding Windows functionality in a linguistically-driven Google language. Microsoft will have to match Google's distributed CI-based functionality in coming years. To not do so would be to risk being late to the next major reinvention of the planetary computing platform. As futurists Mark Finnern and Wayne Radinsky both note, Rich Skrenta's excellent post "The Secret Source of Google's Power," relates that Google's competitive advantage springs from the features of its hardware network. It is developing a distributed computing platform that, in 2004, "can manage web-scale datasets on 100,000 node server clusters. It includes a petabyte, distributed, fault tolerant filesystem, distributed RPC code, perhaps including network shared memory and process migration. And a datacenter management system which lets a handful of ops engineers effectively run 100,000 servers." That's some impressive automation.

Another way to understand the emergence of the internet-based CI is to explore the historical developmental phases of web search technology: 1) The first wave of web search was created by cheap disk drives (Altavista won this war). 2) The second wave has been created by cheap CPUs and Beowulf cluster networks (Google won this war). 3) The third wave, cheap RAM, may be the next inevitable emergence. Cheap RAM will make both massive fast caches and new, far more complex CI-based algorithms possible. Who will win the third wave? That is an open question, at present.

All of this is not to suggest that human learning will decrease as we approach and enter the CI-enabled Symbiotic Era. To the contrary, our learning will clearly shift to a whole new level as all kinds of new collaborative and creative opportunities emerge in CI-driven real and virtual space. I expect the CI to enable our currently laughable digital avatars (digital persona, or "digital me" (DM)) to become both increasingly accurate reflections of the sum of our aspirations (e.g., Lifelogs) and increasingly effective coaches and counselors of our higher selves. Post-2020, I expect my DM will begin doing things in virtual space that are amazing by comparison to what I am doing in physical space. For a fun fictional account of the Symbiotic Era, see my teen-oriented essay "Future Heroes 2035: My Friends and I."

It helps to realize that the biological portion of human species activity is sharply limited by the fixed number of us (6 billion) and the fixed speed (200 miles/hour) of communication within biological brains. From the perspective of computers that are growing their capacity exponentially, and learning and recording a million times faster than our own brains do (on a range of measures), the entire human phase space appears essentially frozen in spacetime. Capture, closure, and convergence between humans and our digital extensions are the dominant features we can expect, from our perspective. (Things look much more exploratory and dynamic from the perspective of the machines.)

In the Symbiotic Era, a time when higher machine intelligence can exist only as a first-level reflection of human aspirations, the most important feature, for planetary intelligence, will be the "Intelligence Amplification" (IA) that increasingly powerful, CI-equipped systems provide to humans who are using them everywhere. I'm presently assuming this era will comprise 30 years, from 2020-2050. That would place the arrival of the next singularity, the Autonomy Era, circa 2050. The latter era would involve the emergence of initially simplistic but eventually strongly biologically-inspired self-improving systems. Because I think such systems must gain their self-awareness through a process of personality capture and co-evolution with human beings, I expect they too will require a nontrivial length of time, in human years, to develop truly complex personalities. This might take perhaps ten human intelligence years, though this would represent a far longer stretch of time in machine intelligence years. That progression would fit with a circa 2060 technological singularity (a developmental "phase change" involving the emergence of human surpassing technological intelligence).

Intelligence amplification (IA) systems like the CI network are highly collectivist in their construction, and will be tested and refined by an entire planet's worth of users. They are also highly symbiotic, and will engage in extensive profiling, simulation, human factors internalization, and "personality capture" of their users' behaviors, habits, goals, limits, and rational and emotive states. As our second-generation, post-2020 CI systems increasingly use personality capture techniques to build sophisticated world-models of their users, we may expect that an increasing number of human beings will give their semi-intelligent agents access to, and influence over, their mind states in an always reversible but progressively more intimate manner. Most accurately, this should be considered a true first-generation form of "uploading," predating the more intensive and invasive forms of uploading that will occur in subsequent iterations of the symbiosis.

Will you choose to let your 2030 machines cheer you up, advise you on your interaction style, or tell you when to take a work break? Today there are robotic toys (AIBO, smart dolls) that already program their users to provide specific emotional responses. The tremendous utility, comfort, and productivity of CIs utilizing personality capture and more specialized tools such as knowledge management (a first-generation electronic forebrain) will compellingly demonstrate to millions of modern skeptics that the developmental destination of human-machine interaction is not some dystopian scenario of computer domination or isolation, but instead an increasingly seamless and symbiotic convergence.

After the Symbiotic Age: Speculations on Autonomy and Beyond

Working from simple log-periodic acceleration models like the developmental spiral, we can argue that the Scientific Age lasted for roughly 280 years (1490-1770), followed by an Industrial Age of 180 years (1770-1950). If these trends continue, then today's Information Age will last for only 70 years (1950-2020), and we can expect the coming Symbiotic Age, driven by CI network and biologically inspired computing advances, to last approximately 30 years (2020-2050). Beyond this, if the STEM (space, time, energy and matter) efficiency and density of computation and physical transformation continue at past trends, we can foresee a coming developmental singularity, should the structure of spacetime allow continued intelligence acceleration toward the Planck scale.
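As an illustrative sketch (my construction, not a calculation from the essay itself), the era durations implied by these boundary dates can be run through a simple geometric-compression model: if each era lasts a roughly constant fraction of the previous one, the remaining era lengths form a geometric series whose finite sum yields a rough convergence date. The boundary years below are taken from the essay; the constant-ratio assumption is a simplification.

```python
# Hedged sketch: estimate a "convergence date" from the essay's era boundaries,
# assuming each era's duration is a roughly constant fraction of the previous one.
boundaries = [1490, 1770, 1950, 2020, 2050]  # Scientific -> Industrial -> Information -> Symbiotic
durations = [b - a for a, b in zip(boundaries, boundaries[1:])]  # [280, 180, 70, 30]

# Compression ratio between successive eras, averaged (~0.49 for these dates)
ratios = [d2 / d1 for d1, d2 in zip(durations, durations[1:])]
r = sum(ratios) / len(ratios)

# Remaining time after 2050 sums as a geometric series: d*r + d*r^2 + ... = d*r/(1-r)
convergence = boundaries[-1] + durations[-1] * r / (1 - r)
print(round(convergence))  # -> 2078 under these assumptions
```

The result lands in the late 21st century, in the same general ballpark as, though a couple of decades later than, the essay's circa-2060 estimate; changing the assumed ratio or boundary years shifts the date accordingly, which is why such models are best read as order-of-magnitude intuitions rather than forecasts.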

There are at least two points we might make here. First, if this model is roughly correct, the next fundamentally new era, which we are calling the Symbiotic Age, will occur over a thirty-year period between 2020 and 2050. Hierarchical developmental models almost always require that change proceed in a pattern known as "punctuated equilibrium": brief bursts of new activity followed by longer plateaus of consolidation, in which the final years of one stage are always significantly slower than the early years of the next. The future history we have offered, then, commits us to an expectation that the next twenty years will be primarily a less remarkable continuation of the groundwork begun fifty years ago, at the birth of the Information Age.

Seen in retrospect, then, the period between the emergence of the developmentally inevitable Internet and its next necessary offspring, the CI network, will not likely be considered, from the human perspective, a period of increasingly dramatic leaps, but rather one of many steady, smaller, and less noticeable improvements, preparing us for the next great surge of technological change. Borrowing a popular phrase, we can say that the Symbiotic Age will require fifteen to twenty-five additional years of slow going and hard work before it suddenly and surprisingly becomes an "overnight success" in the 2020s.

As a second point, consider the apparent dichotomy between the unprecedented scientific and technological acceleration presently seen in first world countries and their decreasing rate of cultural change, as they develop ever finer distinctions in permissible social, politicolegal, and economic behavior. Such deceleration in at least the magnitude of cultural change in the first world strongly suggests we are rapidly moving toward an end-stage, saturation phase of development in the human computational substrate. In other words, there's not much more social optimizing left that can be easily done by human beings running human societies. As Francis Fukuyama (The End of History, 1992) and others have observed, all the world's governments drift closer every year to a scientific, capitalistic, democratic final common pathway for social development.

In this new world the average citizen, having a sharply finite capacity for change absorption, increasingly insulates their social and cultural consciousness from the developmental hurricanes occurring in the technological systems around them. Technological acceleration continues unabated; it just increasingly proceeds "under the hood" of the car of change, so to speak. Think about all the computation that went into the creation of an advanced hybrid automobile like the Toyota Prius, for example, and how oblivious we are to that, versus the car technology of the 1950s.

So does this mean that the total amount of socioeconomic change must slow down in the Symbiotic Age? Hardly. We only need to lose our first world bias to understand the unprecedented nature of the changes to come. If billions of presently marginalized human beings are uplifted toward first world socioeconomic status in coming decades, to a place where only tens or hundreds of millions have gone before, this will still represent massively unprecedented socioeconomic change for the average distributed complexity of humanity on Earth. We can expect these changes, given current trends, even as first world culture becomes increasingly canalized (comfortably settled), social-benefits oriented, and regulated with every passing year.

This new post-CI growth spurt of globalization will be tempered somewhat by technological systems that allow us to increasingly preserve and maintain existing cultural histories and, with less fidelity, cultural differences. It will occur as human individual and cultural consciousnesses become steadily better at either insulating themselves from, or balancing themselves within, the accelerating computational change occurring all over the planet. Globalization debates are today often framed in terms of how to help the third world rise to first world standards. By mid-century, they are likely to be framed in terms of how to help societies of every type become more change-seeking, versus change-averse, with regard to many powerful new opportunities for human-machine symbiosis. "Symbiozation," not globalization, sounds like the dominant cultural agenda, alongside refining globalization, in an era of late-21C economic abundance, to be far more equitable and pluralistic, and slowly demilitarizing a planet that may finally have sufficient transparency and trust to allow flatter, more bottom-up rather than top-down systems of global security.

If we wish to be acceleration-aware in our forecasting, we may not be done yet. During the Symbiotic Age, the most successful of our CIs will likely incorporate substantial bottom-up-developed, biologically inspired evolvable hardware (EHW) components, as well as a wide range of scanned and reverse-engineered structural and architectural elements of metazoan neural networks within their differentiating body plans. Somewhere within this process they will begin to develop a high-level, scalable, and robust ability to direct their own self-improvement, self-repair, and self-generation (e.g., limited self-replication, variation, and selection of even high-level neural architectures).

This vision argues that we must take a highly distributed, incremental, and network-centric developmental route to the "Planetization" of humanity, a concept eloquently envisioned by Teilhard de Chardin in 1945. (See "The Planetisation of Mankind" in The Future of Man.) The strong claim that I wish to make is that, just as linguistic AI will require a planetary network of human beings incrementally tuning up the conversational interface (CI) to human-level utility, so too will it require a planetary network of human beings tuning up all our robotic systems, to produce a broad collection of utility robots that are sufficiently situationally intelligent to interact in the human environment. This problem is not an easy one. Like the CI, it is extremely complex, and robots will be incredibly stupid for decades. Only the input and detailed, collective feedback of their user-gardeners will allow them to become less than stupid.

Generally, I call this perspective the "95/5% Rule": the idea that all major substrate transitions seem to be primarily ("95%") bottom-up and experimental, and only slightly ("5%") guided by top-down, hierarchical or developmental control. It suggests that the primary role of the average 21C human being, from technology's perspective, will be as a trainer and gardener of the growth of tomorrow's intelligent technologies, just as we today are socially constructing the wired and wireless participatory web.

Perhaps we will choose to allow various forms of increasingly autonomous self-replication, with appropriate safeguards, because the adaptive machines they empower will be natural incremental extensions of the machine learning paradigms presently in existence, and because such extensions will demonstrate dramatically greater human utility, as well as ever more self-balancing and statistically safe behaviors, in the vast majority of artificial selection environments. But perhaps most importantly, our increasing understanding of biological, cultural, and technological immune systems, systems that were poorly understood by early 21st century thinkers, will allow us to proceed with growing wisdom.

As we begin to collectively train our machines, we can and should expect that any local catastrophes that do occur (e.g., unpredictably behaving, unsafe learning agents) will be rigorously contained by a redundant, fault-tolerant, healthy immune architecture. We will come to realize that those micro-catastrophes that do occur, within healthy immune environments, can only catalyze immune learning, increasing the "average distributed complexity" of the system, as well as general system intelligence. This increasing informational immunity appears to be one of the great hidden mechanisms that has guaranteed the accelerating hierarchical emergence of computational substrates in universal history.

Within a few short years of this new self-directing, self-replicating capacity, we will begin to suspect that our increasingly self-modelling tools and agents—and by association, "we," as a human-machine social network—are a good deal more intelligent than surface appearances indicate. We can call that next era, an age of increasingly self-directing symbiotic machine interfaces, the Autonomy Age. By comparison with previous ages, it may last as little as 10 years (perhaps 2050-2060), taking us to the edge of a circa 2060 technological singularity, in this very simplistic model.

At some point after this we may expect such a high level of integration with our twins that they become conscious extensions of our biological selves. If the patterns of neural synchronization that apparently generate conscious perception in brains can exist within both biological and digital networks, as may very well be physically possible, then we could expect no subjective cessation of conscious experience on the death of our biological self. Furthermore, as our digital self will have both an indefinite lifespan and the ability to back up and continuously fork and reintegrate versions of itself, no significant information destruction will have occurred in the loss of the biological body and brain. As our computer technology continues its accelerative and increasingly autonomous growth, these critical patterns will have been progressively uploaded into our twin. From the perspective of planetary complexity, this would represent a major transition in evolutionary development: an irreversible developmental substrate shift (biological to technological) will have occurred for the leading edge of complexity on our planet. For more on what might happen next, see my 2002 paper, Answering the Fermi Paradox, or its update, The Transcension Hypothesis, 2011.

Greg Stock (Metaman, 1993), Vernor Vinge (1993), and others have written eloquently on the coming technological singularity. But few to date have considered the requisite ethical constraints that must emerge naturally, in a 95% bottom-up, evolutionary fashion, within self-directing technological systems that have many orders of magnitude greater learning capacity than biological brains. We may understand ethics as a form of behavioral immunity that protects an otherwise precarious accelerating intelligence development, and no known intelligent systems exist on Earth without the presence of a competent, healthy, overarching immune system. Thinking about immunity, and helping it emerge naturally as complexity scales, is thus one of our greatest opportunities and challenges in coming decades.

What an amazingly innovative and privileged time to be alive.

Thanks for reading. As always, I appreciate your comments, critiques, fixes, and feedback at: johnsmart{at}accelerating{dot}org.