Making the Web More Intelligent

Stephen DeAngelis

August 14, 2012

Nearly two years ago, Steve Lohr wrote, “When problems are nuanced or ambiguous, or require combining varied sources of information, computers are no match for human intelligence.” [“Aiming to Learn as We Do, a Machine Teaches Itself,” New York Times, 4 October 2010] The reason is that nuanced and ambiguous problems often involve language or semantics. Lohr explained:

“Few challenges in computing loom larger than unraveling semantics, understanding the meaning of language. One reason is that the meaning of words and phrases hinges not only on their context, but also on background knowledge that humans learn over years, day after day.”

That is why Web 3.0, the so-called semantic web, has been so elusive. Karsten Strauss reports that an entrepreneur named “Lars Hard feels he has seen the future of search, information gathering and the web in general, and it is artificial intelligence.” [“Artificial Intelligence is the Next Step in Search (and everything else),” Forbes, 29 June 2012] Strauss believes a better term for what Hard envisions as the future of the web is “computational intelligence.” He defines computational intelligence as “the ability to gather information or find your destination on the web faster and more efficiently with an artificial search partner that can predict what you want.” He continues:

“The number of search engines, the myriad pathways from a user’s inquisitive mind to the answer he or she seeks requires that someone, or some thing, must learn the complexities of people’s search patterns in order to better serve the searcher. ‘Everything on the internet will have to get more intelligent,’ said Hard. ‘Siri is a very good example of what we’re going to see more of in the future,’ said Hard, who has set up his own company, ExpertMaker, to develop software for artificial intelligence. The company’s platform consists of a server, toolkit and API that provides its clients with technology for upgrading existing, or building new, AI products. Hard recounts that when Google was a young company, he and others were using Alta Vista as their search engine. Google’s offering was not hugely better, but the incremental increase in qualitative results was felt by users and they responded by using the now giant search company. AI could be that next step in the evolution of search, Hard said.”

As Strauss notes, the key feature of an intelligent search engine is learning “the complexities of people’s search patterns.” I agree with the premise that learning systems represent the future. In the article cited above, Lohr focused on a learning system created at Carnegie Mellon University with grant support from the Defense Advanced Research Projects Agency and Google. The system is called the Never-Ending Language Learning system, or NELL. Lohr reported, “The computer was primed by the researchers with some basic knowledge in various categories and set loose on the Web with a mission to teach itself.” Lohr continued:

“NELL has made an impressive showing so far. NELL scans hundreds of millions of Web pages for text patterns that it uses to learn facts, 390,000 to date, with an estimated accuracy of 87 percent. These facts are grouped into semantic categories — cities, companies, sports teams, actors, universities, plants and 274 others. The category facts are things like ‘San Francisco is a city’ and ‘sunflower is a plant.’ NELL also learns facts that are relations between members of two categories. For example, Peyton Manning is a football player (category). The Indianapolis Colts is a football team (category). By scanning text patterns, NELL can infer with a high probability that Peyton Manning plays for the Indianapolis Colts — even if it has never read that Mr. Manning plays for the Colts. ‘Plays for’ is a relation, and there are 280 kinds of relations. The number of categories and relations has more than doubled since earlier this year, and will steadily expand. The learned facts are continuously added to NELL’s growing database, which the researchers call a ‘knowledge base.’”
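
To make the idea concrete, here is a toy sketch in Python of the kind of pattern-based inference Lohr describes. It is not NELL's actual code. The seed facts, the two example sentences, and every function name are my own assumptions for illustration. The sketch learns which text pattern connects a known player and team, then applies that pattern to a pair it has no relation fact for.

```python
from collections import defaultdict

# Hypothetical seed knowledge: category facts plus one known relation.
CATEGORIES = {
    "football_player": {"Peyton Manning", "Tom Brady"},
    "football_team": {"Indianapolis Colts", "New England Patriots"},
}
SEED_RELATIONS = {("Tom Brady", "plays_for", "New England Patriots")}

# Hypothetical stand-in for text scraped from Web pages.
CORPUS = [
    "Tom Brady threw three touchdowns for the New England Patriots on Sunday.",
    "Peyton Manning threw three touchdowns for the Indianapolis Colts on Sunday.",
]


def entities_in(sentence, category):
    """Return the members of a category that are mentioned in the sentence."""
    return [entity for entity in CATEGORIES[category] if entity in sentence]


def extract_pattern(sentence, subject, obj):
    """Return the text between subject and object, or None if it is absent."""
    start = sentence.find(subject)
    end = sentence.find(obj)
    if start == -1 or end == -1 or end < start:
        return None
    return sentence[start + len(subject):end].strip()


def learn_patterns(corpus, seeds):
    """Collect the text patterns that link entity pairs in known relations."""
    patterns = defaultdict(set)
    for subject, relation, obj in seeds:
        for sentence in corpus:
            pattern = extract_pattern(sentence, subject, obj)
            if pattern:
                patterns[relation].add(pattern)
    return patterns


def infer_relations(corpus, patterns, subject_category, object_category):
    """Propose relation facts for typed entity pairs joined by a known pattern."""
    inferred = set()
    for sentence in corpus:
        for subject in entities_in(sentence, subject_category):
            for obj in entities_in(sentence, object_category):
                pattern = extract_pattern(sentence, subject, obj)
                for relation, known in patterns.items():
                    if pattern in known:
                        inferred.add((subject, relation, obj))
    return inferred


patterns = learn_patterns(CORPUS, SEED_RELATIONS)
facts = infer_relations(CORPUS, patterns, "football_player", "football_team")
new_facts = facts - SEED_RELATIONS
print(new_facts)
```

In this toy run, the only new fact proposed is that Peyton Manning plays for the Indianapolis Colts, even though no sentence states it in those words. A real system would need many patterns, confidence estimates, and some way to retire facts that stop being true.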

Let’s hope it doesn’t take NELL too much time to learn that Peyton Manning has switched teams. Tom M. Mitchell, a computer scientist and chairman of Carnegie Mellon’s machine learning department, told Lohr that “a larger pool of facts … will help refine NELL’s learning algorithms so that it finds facts on the Web more accurately and more efficiently over time.” Lohr continued:

“NELL is one project in a widening field of research and investment aimed at enabling computers to better understand the meaning of language. Many of these efforts tap the Web as a rich trove of text to assemble structured ontologies — formal descriptions of concepts and relationships — to help computers mimic human understanding. The ideal has been discussed for years, and more than a decade ago Sir Tim Berners-Lee, who invented the underlying software for the World Wide Web, sketched his vision of a ‘semantic Web.’”

Strauss reports that Hard believes the transition to a semantic web will take place within the next decade. As a result, “Hard predicts the web will shift from just serving static images and a lot of unstructured text.” Strauss continues:

“In search, social and other categories of information, we’re seeing an information overload which will be replaced by a more computational nature and ‘reasoning’ that can help users sift through a massive ocean of information. For example, inputting data in a search will be remembered, combined with any other basic information a search engine can glean about a user and build a ‘memory’ of statistics and patterns that will allow it to arrive at a likely informational destination faster. In ecommerce, Hard said, no longer will several key phrases lead a user to the same top ten listings on Google or Bing; the user will be brought to those online merchants that have the most specific product being sought. Other applications include medical diagnostics, he added.”
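
The search “memory” Strauss describes can be illustrated with a very small Python sketch. This is only my own guess at the general idea, not ExpertMaker's platform or any real search engine's method. The class name `SearchMemory`, its methods, and the sample queries and results are all hypothetical. It remembers the terms a user has searched for and ranks new results higher when they contain those terms.

```python
from collections import Counter


class SearchMemory:
    """Hypothetical per-user memory of past query terms."""

    def __init__(self):
        self.term_counts = Counter()

    def remember(self, query):
        """Record every term of a query the user has entered."""
        self.term_counts.update(query.lower().split())

    def score(self, document):
        """Sum the remembered counts of the terms that appear in a document."""
        words = set(document.lower().split())
        return sum(count for term, count in self.term_counts.items() if term in words)

    def rerank(self, results):
        """Order results so those closest to past searches come first."""
        return sorted(results, key=self.score, reverse=True)


memory = SearchMemory()
memory.remember("trail running shoes")
memory.remember("waterproof running jacket")

results = [
    "Office chairs on sale",
    "Waterproof trail shoes for running",
    "Running news weekly",
]
ranked = memory.rerank(results)
print(ranked)
```

Simple term counting is about the crudest form of “memory” possible. The point is only that remembered behavior changes what the user sees first.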

I’m glad that Strauss and Hard pointed out that the pursuit of a better search engine isn’t all about the web. I believe that learning systems will have their greatest impact in other areas (like medicine) and that they will improve the quality of life for the majority of Earth’s inhabitants. Strauss concludes:

“Hard’s vision of the future starts to become really complex when you try to fathom the idea of Siri-like AI personal assistants, using a handful of other AI search engine/info-sifters, which in turn are relying another handful of AI agents. As I myself imagine it, I’m reminded of connections and communications between brain synapses and wonder how massive a global info network could get. As fast as technologies – and technologists like Hard – move forward, we’re probably still safe from dealing with actual thinking machines. Or are we?”

Dominic Basulto thinks he sees an answer to that question. He writes, “The accelerating pace of technological change is leading to the creation of entirely new opportunities for humans to ‘play God’ — to create and transform life in a way that has never been possible. What was once thought to be the exclusive realm of a higher power is now within the grasp of human beings.” And the catalyst for this ability to play God, he writes, is artificial intelligence. [“How we’re playing God now,” Washington Post, 29 June 2012] “Artificial intelligence (AI) researchers,” he writes, “are creating advanced forms of machine learning that rival human intelligence.” Basulto reports on an interesting experiment that involved only images of cats. He writes:

“Machines, as you’ve probably noticed, have been getting smarter. The most recent advance: Google’s ‘Cat Experiment’, in which 16,000 computers hooked up to a vast neural network learned to recognize the concept of a ‘cat’ after being shown over 10 million digital photos of cats. This marks a fundamental breakthrough in artificial intelligence for one simple reason: the computers were never told what a ‘cat’ was before the experiment started and were not given a series of human rules for recognizing cats. The computers arrived at an abstract conceptualization of ‘cat’ the same way an infant might arrive at an abstract conceptualization of ‘cat’ before knowing what the word means. It’s the difference between teaching a computer the rules of how to play chess, and a bunch of computers spontaneously arriving at the very concept of chess — and then coming up with a way to win. Yes, the machines are, for all intents and purposes, alive.”

Basulto concludes, “The accelerating pace of technological development certainly seems to be pointing to a future that is infinitely more complex and varied than we ever thought possible.” And that future is being made possible through advances in learning systems. Basulto discusses numerous areas in which learning systems are permitting researchers to “play God.” But I’ll save that discussion for another post. The learning machines that most people will interact with directly will be connected to the Web and accessed through personal devices. If Lars Hard is correct, a future that includes the semantic web is not too far over the horizon.