According to the Times, true artificial intelligence is just around the corner. A year ago, the paper ran a front-page story about the wonders of new technologies, including deep learning, a neurally-inspired A.I. technique for statistical analysis. Then, among others, came an article about how I.B.M.’s Watson had been repurposed into a chef, followed by an upbeat post about quantum computation. On Sunday, the paper ran a front-page story about “biologically inspired processors,” “brainlike computers” that learn from experience.
This past Sunday’s story, by John Markoff, announced that “computers have entered the age when they are able to learn from their own mistakes, a development that is about to turn the digital world on its head.” The deep-learning story, from a year ago, also by Markoff, told us of “advances in an artificial intelligence technology that can recognize patterns offer the possibility of machines that perform human activities like seeing, listening and thinking.” For fans of “Battlestar Galactica,” it sounds like exciting stuff.
But, examined carefully, the articles seem more enthusiastic than substantive. As I wrote before, the story about Watson was off the mark factually. The deep-learning article had problems, too. Sunday’s story is confused at best; there is nothing new in teaching computers to learn from their mistakes. Instead, the article seems to be about building computer chips that use “brainlike” algorithms, but the algorithms themselves aren’t new, either. As the article notes in passing, “the new computing approach” is “already in use by some large technology companies.” Mostly, the article seems to be about neuromorphic processors—computer processors that are organized to be somewhat brainlike—though, as the article points out, they have been around since the nineteen-eighties. In fact, the core idea of Sunday’s article—nets based “on large groups of neuron-like elements … that learn from experience”—goes back over fifty years, to the well-known Perceptron, built by Frank Rosenblatt in 1957. (If you check the archives, the Times billed it as a revolution, with the headline “NEW NAVY DEVICE LEARNS BY DOING.” The New Yorker similarly gushed about the advancement.) The only new thing mentioned is a computer chip, as yet unproven but scheduled to be released this year, along with the claim that it can “potentially [make] the term ‘computer crash’ obsolete.” Steven Pinker wrote me an e-mail after reading the Times story, saying “We’re back in 1985!”—the last time there was huge hype in the mainstream media about neural networks.
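To see how old the idea of "learning from mistakes" is, here is a minimal sketch of the textbook perceptron learning rule, written in Python. It is an illustration only, not Rosenblatt's original hardware or anything described in the Times article: the function names, the learning rate, and the toy task (the logical AND of two inputs) are all assumptions chosen for brevity.

```python
# A textbook perceptron: one "neuron-like element" that adjusts its
# weights only when it gets an example wrong.
# All names and the toy AND task are illustrative assumptions.

def predict(weights, bias, x):
    """Return 1 if the weighted sum of the inputs exceeds zero, else 0."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Learn weights from (inputs, target) pairs using the perceptron rule."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, target in samples:
            error = target - predict(weights, bias, x)
            if error != 0:
                # The "learning from mistakes" step: nudge each weight
                # in the direction that would have reduced the error.
                mistakes += 1
                weights = [w + learning_rate * error * xi
                           for w, xi in zip(weights, x)]
                bias += learning_rate * error
        if mistakes == 0:
            break  # every training example is now classified correctly
    return weights, bias


# Toy data: output 1 only when both inputs are 1 (logical AND).
AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(AND_DATA)
predictions = [predict(weights, bias, x) for x, _ in AND_DATA]
print(predictions)
```

The whole learning procedure fits in a dozen lines, and it works here only because the AND task is simple enough for a single unit to separate. That is the sense in which the underlying idea is decades old; what has changed since is mostly scale and hardware.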
What’s the harm? As Yann LeCun, the N.Y.U. researcher who was just appointed to run Facebook’s new A.I. lab, put it a few months ago in a Google+ post, a kind of open letter to the media, “AI [has] ‘died’ about four times in five decades because of hype: people made wild claims (often to impress potential investors or funding agencies) and could not deliver. Backlash ensued. It happened twice with neural nets already: once in the late 60’s and again in the mid-90’s.”
A.I. is, to be sure, in much better shape now than it was then. Google, Apple, I.B.M., Facebook, and Microsoft have all made large commercial investments. There have been real innovations, like driverless cars, that may soon become commercially available. Neuromorphic engineering and deep learning are genuinely exciting, but whether they will really produce human-level A.I. is unclear—especially, as I have written before, when it comes to challenging problems like understanding natural language.
The brainlike I.B.M. system that the Times mentioned on Sunday has never, to my knowledge, been applied to language, or any other complex form of learning. Deep learning has been applied to language understanding, but the results are feeble so far. Among publicly available systems, the best is probably a Stanford project, called Deeply Moving, that applies deep learning to the task of understanding movie reviews. The cool part is that you can try it for yourself, cutting and pasting text from a movie review and immediately seeing the program’s analysis; you can even teach it to improve. The less cool thing is that the deep-learning system doesn’t really understand anything.
It can’t, say, paraphrase a review or mention something the reviewer liked, things you’d expect of an intelligent sixth-grader. About the only thing the system can do is so-called sentiment analysis, reducing a review to a thumbs-up or thumbs-down judgment. And even there it falls short; after typing in “better than ‘Cats!’ ” (which the system correctly interpreted as positive), the first thing I tested was a Rotten Tomatoes excerpt of a review of the last movie I saw, “American Hustle,” “A sloppy, miscast, hammed up, overlong, overloud story that still sends you out of the theater on a cloud of rapture.” The deep-learning system couldn’t tell me that the review was ironic, or that the reviewer thought the whole was more than the sum of the parts. It told me only, inaccurately, that the review was very negative. When I sent the demo to my collaborator, Ernest Davis, his luck was no better than mine. Ernie tried “This is not a book to be ignored” and “No one interested in the subject can afford to ignore this book.” The first came out as negative, the second neutral. If Deeply Moving is the best A.I. has to offer, true A.I.—of the sort that can read a newspaper as well as a human can—is a long way away.
Overhyped stories about new technologies create short-term enthusiasm, but they also often lead to long-term disappointment. As LeCun put it in his Google+ post, “Whenever a startup claims ‘90% accuracy’ on some random task, do not consider this newsworthy. If the company also makes claims like ‘we are developing machine learning software based on the computational principles of the human brain’ be even more suspicious.”
As I noted in a recent essay, some of the biggest challenges in A.I. have to do with common-sense reasoning. Trendy new techniques like deep learning and neuromorphic engineering give A.I. programmers purchase on a particular kind of problem that involves categorizing familiar stimuli, but say little about how to cope with things we haven’t seen before. As machines get better at categorizing things they can recognize, some tasks, like speech recognition, improve markedly, but others, like comprehending what a speaker actually means, advance more slowly. Neuromorphic engineering will probably lead to interesting advances, but perhaps not right away. As a more balanced article on the same topic in Technology Review recently reported, some neuroscientists, including Henry Markram, the director of a European project to simulate the human brain, are quite skeptical of the currently implemented neuromorphic systems on the grounds that their representations of the brain are too simplistic and abstract.
As a cognitive scientist, I agree with Markram. Old-school behaviorist psychologists, and now many A.I. programmers, seem focussed on finding a single powerful mechanism—deep learning, neuromorphic engineering, quantum computation, or whatever—to induce everything from statistical data. This is much like what the psychologist B. F. Skinner imagined in the early nineteen-fifties, when he concluded all human thought could be explained by mechanisms of association; the whole field of cognitive psychology grew out of the ashes of that oversimplified assumption.
At times like these, I find it useful to remember a basic truth: the human brain is the most complicated organ in the known universe, and we still have almost no idea how it works. Who said that copying its awesome power was going to be easy?
Gary Marcus is a professor of psychology at N.Y.U. and a visiting cognitive scientist at the new Allen Institute for Artificial Intelligence. This essay was written in memory of his late friend Michael Dorfman—friend of science, enemy of hype.