In 2012, artificial-intelligence researcher Geoffrey Hinton was talking about the weirdly compelling nonsense his predictive-text engine was spitting out. Now he’s talking about “weirdly compelling common sense” instead.
Hinton, the so-called grandfather of the branch of AI known as “deep learning,” spent much of his career as an outside-the-mainstream researcher at the University of Toronto. In the late ’00s, though, deep learning, whose algorithms use a multi-layered approach to extract increasingly high-level features from raw data, became the darling of the tech sector, forming the foundation for everything from Apple’s Siri to Google’s reverse-image search service. Today, Hinton is a vice-president at Google, the chief scientific adviser at Toronto’s Vector Institute, and a newly minted co-recipient of the Turing Award, the most prestigious award in computer science.
As much as Hinton’s life has changed in the past decade, predictive-text engines have changed even more. A predictive engine “trains” on millions or billions of words in a source text known as a “corpus.” It typically analyzes patterns at the level of letters and characters (rather than words, sentences, or paragraphs), building up a “language model” that allows it to write its own material.
One way to test these engines involves a process akin to an improv-comedy exercise: Someone, generally a researcher, throws out an idea in the form of a few starting words or a sentence. The engine riffs on that opener, creating original text, letter by letter, based on its understanding of the patterns it absorbed from the training corpus.
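The improv exercise can be imitated at toy scale. The sketch below is not the neural network used in Hinton’s lab or at OpenAI; it swaps in a much simpler technique, a character-level Markov chain that counts which letter tends to follow each three-letter context. The function names, the tiny corpus, and the context length of three are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict

def train_char_model(corpus, order=3):
    """Count which character follows each `order`-character context."""
    model = defaultdict(Counter)
    for i in range(len(corpus) - order):
        context = corpus[i:i + order]
        model[context][corpus[i + order]] += 1
    return model

def generate(model, opener, length=80, order=3, seed=0):
    """Extend `opener` one character at a time by sampling from the counts."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    text = opener
    for _ in range(length):
        followers = model.get(text[-order:])
        if not followers:  # context never seen in training: stop early
            break
        chars = list(followers)
        weights = [followers[c] for c in chars]
        text += rng.choices(chars, weights=weights)[0]
    return text

# Hypothetical two-sentence corpus; real engines train on billions of words.
corpus = "geoffrey hinton is a researcher. geoffrey hinton is a professor. "
model = train_char_model(corpus, order=3)
sample = generate(model, "geoffrey hinton is", length=40)
print(sample)
```

With a corpus this small the output mostly recombines the training sentences, but the loop is the same in spirit: take the opener, predict the next character from observed patterns, append it, and repeat.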
In 2012, the best engine in Hinton’s lab could assemble subjects, verbs, and objects, together with some in-the-ballpark punctuation, into structurally sound sentences. But the content was meaningless — a stream of nonsense with no conceptual continuity from one word to the next.
Type in the words “Geoffrey Hinton is” and the algorithm would spit back lines such as “Geoffrey Hinton is a man apart,” “Geoffrey Hinton is a nationally lost artist,” and “Geoffrey Hinton is a 3rd soloist, raiding trumpeter and the true genius.”
The effect was eerie. The algorithm seemed to tap into some deep aspect of language structure, even if it couldn’t yet say something meaningful.
Earlier this year, the California-based research company OpenAI (co-founded by Elon Musk) released a new predictive-text generator. Hinton was not involved with the OpenAI model, but it built directly on his earlier work.
Ilya Sutskever, OpenAI’s research director and a former graduate student of Hinton’s, says its new language model is so powerful that the company held back the full trained model for fear it would be used maliciously.
“The thinking was, you could generate fake news articles, automatically create convincing harassment complaints online,” Sutskever says. “We were unwilling to release the parameters of the model.”
OpenAI testers gave the new engine an absurdist opener:
“In a shocking finding, scientist discovered a herd of unicorns living in a remote, previously unexplored valley, in the Andes Mountains. Even more surprising to the researchers was the fact that the unicorns spoke perfect English.”
The engine returned with this: “The scientist named the population, after their distinctive horn, Ovid’s Unicorn. These four-horned, silver-white unicorns were previously unknown to science. Now, after almost two centuries, the mystery of what sparked this odd phenomenon is finally solved. Dr. Jorge Perez, an evolutionary biologist from the University of La Paz, and several companions, were exploring the Andes Mountains when they found a small valley, with no other animals or humans.
“Perez noticed that the valley had what appeared to be a natural fountain, surrounded by two peaks of rock and silver snow. Perez and the others then ventured further into the valley.
“‘By the time we reached the top of one peak, the water looked blue, with some crystals on top,’ said Perez. Perez and his friends were astonished to see the unicorn herd. These creatures could be seen from the air without having to move too much to see them – they were so close they could touch their horns.”
The story continues for several more paragraphs. It’s not perfect. The opener makes it sound as though talking unicorns were a recent discovery, and the follow-up implies that the mystery of their existence has been pondered for centuries. Also, how can a unicorn have four horns?
But to catch even these obvious slips, a reader would have to be paying attention. This technology could easily turn out passingly convincing conspiracy theories, political slander, and hate propaganda on a massive scale.
How did these engines get so good so fast? Sutskever puts it down simply to more computing power and larger training sets.
“The rapid progress is 100 per cent attributable to the fact that they get better when they get bigger,” he says. “And they will keep getting bigger. There is no upper limit.”
Hinton says that the engine also now has new ways to “remember” what it’s talking about.
“There have been some technical changes that allow it to stay on topic,” he says. “The ‘Bolivian Unicorn’ story goes on for many paragraphs, staying on the same theme as the original lead-in. It keeps saying things that are relevant to things it said previously.”
Computers don’t mimic the specific mechanisms by which a human brain stores short-term memories, but they achieve a similar effect — “in a clumsy way,” Hinton says — by storing patterns in a type of artificial-neural-network architecture known as a “transformer.”
Transformers effectively give artificial intelligences a longer attention span, helping them keep track of what they were just talking about and relate their past experiences to the present.
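The core operation behind that longer attention span is usually described as “attention”: for each new position, the network scores every earlier position for relevance and blends their stored information accordingly. The sketch below shows one common formulation, scaled dot-product attention for a single query, in plain Python. The vectors are made up for illustration, and real transformers add learned projections, multiple attention heads, and many stacked layers that are omitted here.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    # Relevance of each earlier position to the current one.
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # Weighted blend of the values stored at the earlier positions.
    dim = len(values[0])
    output = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, output

# Hypothetical 2-D vectors standing in for three earlier words.
keys = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
query = [2.0, 0.0]  # most similar to the first key

weights, output = attend(query, keys, values)
print(weights, output)
```

Because the query points in the same direction as the first key, the first position receives the largest weight, which is the mechanism that lets a model keep “looking back” at the unicorns several paragraphs later.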
Predictive engines can do more than generate text. Train one on a music corpus, and it can compose in the style of Bach, Mozart, or Jagger. And they’re also getting better at creating visual art.
Fake news aside, predictive engines are also poised to upend many creative industries.
“On one hand, there’s going to be an unbelievable variety and quantity of content. You think there’s a lot of content now, but there will be much more,” says Sutskever. “I think the task of being a creative person — a writer or an artist — will change. Some things will get harder, and others will get easier. You could write a novel much more quickly. You give the computer your idea, and it could help you write some parts.”
Deep learning is also driving precipitous improvements in computers’ ability to carry on conversations and to ask and answer questions — the kind of skills needed in everything from customer support to medical diagnosis.
Priyanka Agrawal, a senior research engineer at IBM Research, says the full potential of deep learning remains unclear.
“The deep-learning wave was sudden,” she says. “Many of the classical machine-learning algorithms that were used extensively were replaced by deep-learning models. However, one needs to understand there is still a long way to go.”
Agrawal works on “natural language processing” applications — essentially teaching computers to talk to people flexibly, casually, and informally.
“The limitations of deep-learning algorithms are primarily with their ability to generalize across a wide variety of tasks,” she says. “There are models specifically to consider sarcasm, puns, emojis, etc., but nothing that can factor all this in and arrive at meaningful conversations as a whole.”
But Agrawal is describing current limitations, not future ones.
“I think we have barely scratched the surface as to what is possible with deep learning,” she says.
In addition to getting bigger, language models are also starting to graduate from letter-by-letter analysis to larger chunks of language. This development is a step up in sophistication, both in terms of training and potential results.
Analyzing letter patterns requires only that an algorithm process a few dozen elements: upper and lowercase letters, the digits zero through nine, and a small collection of punctuation marks, spaces, carriage returns, etc. Dealing with “language fragments” is a far more complicated business.
“You take a language and find the 32,000 most common character strings,” says Hinton, who keeps current on developments in language models although his own research now focuses more on the visual aspects of deep learning.
“You say, how can I find 32,000 character strings that let you express any string in this language?”
In English, the fragments would include all the individual letters (necessary for parsing proper names that don’t fit standard language patterns), most prefixes and suffixes, and many common combinations such as “the,” “str,” “ph,” and “and.” The algorithm breaks the corpus into these fragments during training, and the engine makes its predictions using the same units.
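One widely used way to build such a fragment vocabulary is byte-pair-encoding-style merging: start with single characters and repeatedly fuse the most frequent adjacent pair into a new fragment. Hinton does not name a specific algorithm, so treating this as the method is an assumption; the sketch below uses a hypothetical seven-word corpus and four merges where a production system would use a huge corpus and tens of thousands.

```python
from collections import Counter

def merge_pair(symbols, pair):
    """Fuse every adjacent occurrence of `pair` into a single fragment."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)

def learn_merges(words, num_merges):
    """Learn an ordered list of merges from a list of training words."""
    vocab = Counter(tuple(w) for w in words)  # each word starts as characters
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        merges.append(best)
        merged = Counter()
        for symbols, freq in vocab.items():
            merged[merge_pair(symbols, best)] += freq
        vocab = merged
    return merges

def tokenize(word, merges):
    """Split a word into fragments by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)

# Hypothetical training words chosen so "the" and "and" become fragments.
words = ["the", "then", "they", "other", "and", "hand", "sand"]
merges = learn_merges(words, num_merges=4)
print(tokenize("the", merges), tokenize("brand", merges), tokenize("hinton", merges))
```

Individual letters always remain available as a fallback, which is why a name that fits no learned pattern can still be spelled out fragment by fragment.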
With emerging research avenues like this, neither Hinton, Agrawal, nor Sutskever expects the improvements in predictive text to hit a wall in the foreseeable future. All acknowledge the dangers of really good deep learning: Sutskever talks about a tidal wave of fake news; Agrawal mentions privacy-invading facial-recognition applications; Hinton has spent his entire career fretting about military applications. (He has always refused military research funding, although he says he’s aware that knowledge he has helped create can be used to create autonomous machines of war.)
Even with these cautions and caveats, though, they still skew utopian.
“Like the railroad, automobiles, and computers, deep learning has tremendous potential to transform the world into a better place,” says Agrawal.
Hinton says his research in visual AI could help make machines better than human technicians at reading medical scans and diagnosing health problems earlier.
And Sutskever’s hopes for AI are wide-ranging.
“I do think machines eventually should make our economy so productive it will make a true world of abundance,” says Sutskever. “I look forward to not only automating the creation of text and art, but also to medical diagnostics, medical research, nursing giving us very cheap health care.”