Artificial Intelligence: The Quest for Machines that Think Like Humans, Part 3
February 01, 2012
This is the final segment of a three-part series on artificial intelligence. In Part 1, I discussed work being done at IBM, supported by funding from DARPA, related to the development of cognitive computers. In Part 2, I discussed work being done elsewhere. As I indicated earlier, this segment discusses work being conducted at Carnegie Mellon University on the Never-Ending Language Learning system, or NELL. Having had an association with Carnegie Mellon for a number of years, I know firsthand how talented the people associated with that institution can be. An article in The New York Times about NELL caught my eye because NELL deals with language and semantics. My company, Enterra Solutions, maintains a partnership with another group that focuses on semantics and ontology, Cycorp, a “leading provider of semantic technologies that bring a new level of intelligence and common sense reasoning to a wide variety of software applications. The Cyc® software combines an unparalleled common sense ontology and knowledge base with a powerful reasoning engine and natural language interfaces to enable the development of novel knowledge-intensive applications.” More on that later.
In the Times article, Steve Lohr writes, “Few challenges in computing loom larger than unraveling semantics, understanding the meaning of language. One reason is that the meaning of words and phrases hinges not only on their context, but also on background knowledge that humans learn over years, day after day.” [“Aiming to Learn as We Do, a Machine Teaches Itself,” 4 October 2010] Lohr is absolutely correct. That is why analyzing unstructured Big Data can be such a challenge and why my company uses Cyc software. Lohr continues:
“A team of researchers at Carnegie Mellon University — supported by grants from the Defense Advanced Research Projects Agency and Google, and tapping into a research supercomputing cluster provided by Yahoo — has been fine-tuning a computer system that is trying to master semantics by learning more like a human. Its beating hardware heart is a sleek, silver-gray computer — calculating 24 hours a day, seven days a week — that resides in a basement computer center at the university, in Pittsburgh. The computer was primed by the researchers with some basic knowledge in various categories and set loose on the Web with a mission to teach itself. ‘For all the advances in computer science, we still don’t have a computer that can learn as humans do, cumulatively, over the long term,’ said the team’s leader, Tom M. Mitchell, a computer scientist and chairman of the machine learning department.”
According to Lohr, NELL “has made an impressive showing so far.” He explains:
“NELL scans hundreds of millions of Web pages for text patterns that it uses to learn facts, 390,000 to date, with an estimated accuracy of 87 percent. These facts are grouped into semantic categories — cities, companies, sports teams, actors, universities, plants and 274 others. The category facts are things like ‘San Francisco is a city’ and ‘sunflower is a plant.’ NELL also learns facts that are relations between members of two categories. For example, Peyton Manning is a football player (category). The Indianapolis Colts is a football team (category). By scanning text patterns, NELL can infer with a high probability that Peyton Manning plays for the Indianapolis Colts — even if it has never read that Mr. Manning plays for the Colts.”
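To make the distinction between category facts and relation facts concrete, here is a minimal sketch in Python. It is not NELL's actual algorithm. The relation name `plays_for`, the text snippets, and the scoring function are all assumptions invented for illustration. The idea is simply that two entities whose categories fit a relation's expected types, and that keep appearing together in text, become a candidate relation fact with some confidence, even though no snippet says "plays for".

```python
from collections import Counter

# Category facts of the kind quoted above (entity -> category).
categories = {
    "Peyton Manning": "football player",
    "Indianapolis Colts": "football team",
    "San Francisco": "city",
}

# Hypothetical relation schema: relation -> (subject category, object category).
relation_types = {"plays_for": ("football player", "football team")}

# Made-up snippets standing in for Web text; none contains "plays for".
snippets = [
    "Peyton Manning threw three touchdowns for the Indianapolis Colts",
    "Indianapolis Colts quarterback Peyton Manning",
    "Peyton Manning visited San Francisco",
]


def candidate_relations(categories, relation_types, snippets):
    """Count co-occurrences of entity pairs whose categories fit a relation."""
    counts = Counter()
    for text in snippets:
        present = [entity for entity in categories if entity in text]
        for subj in present:
            for obj in present:
                for rel, (subj_cat, obj_cat) in relation_types.items():
                    if categories[subj] == subj_cat and categories[obj] == obj_cat:
                        counts[(subj, rel, obj)] += 1
    return counts


def confidence(count, k=1.0):
    """Toy score in [0, 1): more supporting snippets give higher confidence."""
    return count / (count + k)


counts = candidate_relations(categories, relation_types, snippets)
for fact, count in counts.items():
    print(fact, round(confidence(count), 2))
```

The snippet about San Francisco contributes nothing, because a city does not fit the types expected by the hypothetical `plays_for` relation. A real system would rely on far richer text patterns than simple co-occurrence.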
Inference is an important attribute for any learning system. Cyc also has an inference engine that can perform “general logical deduction (including modus ponens, modus tollens, and universal and existential quantification), with AI’s well-known named inference mechanisms (inheritance, automatic classification, etc.) as special cases.” One of the big advantages of Cyc is that it can scale in ways that other approaches can’t. Lohr continues:
“The number of categories and relations [in NELL] has more than doubled since earlier this year, and will steadily expand. The learned facts are continuously added to NELL’s growing database, which the researchers call a ‘knowledge base.’ A larger pool of facts, Dr. Mitchell says, will help refine NELL’s learning algorithms so that it finds facts on the Web more accurately and more efficiently over time. NELL is one project in a widening field of research and investment aimed at enabling computers to better understand the meaning of language. Many of these efforts tap the Web as a rich trove of text to assemble structured ontologies — formal descriptions of concepts and relationships — to help computers mimic human understanding. The ideal has been discussed for years, and more than a decade ago Sir Tim Berners-Lee, who invented the underlying software for the World Wide Web, sketched his vision of a ‘semantic Web.’ Today, ever-faster computers, an explosion of Web data and improved software techniques are opening the door to rapid progress.”
Lohr notes that no single approach dominates the field. Besides small players like CMU, semantic research is being conducted by “government labs, Google, Microsoft, [and] I.B.M.” Lohr briefly mentions IBM’s Watson computer that won at “Jeopardy!” He states that Watson “shows remarkable semantic understanding in fields like history, literature and sports.” As I’ve noted before, Watson doesn’t do so well when questions involve ambiguity or nuance. Lohr continues:
“Google Squared, a research project at the Internet search giant, demonstrates ample grasp of semantic categories as it finds and presents information from around the Web on search topics like ‘U.S. presidents’ and ‘cheeses.’ Still, artificial intelligence experts agree that the Carnegie Mellon approach is innovative. Many semantic learning systems, they note, are more passive learners, largely hand-crafted by human programmers, while NELL is highly automated. ‘What’s exciting and significant about it is the continuous learning, as if NELL is exercising curiosity on its own, with little human help,’ said Oren Etzioni, a computer scientist at the University of Washington, who leads a project called TextRunner, which reads the Web to extract facts. Computers that understand language, experts say, promise a big payoff someday. The potential applications range from smarter search (supplying natural-language answers to search queries, not just links to Web pages) to virtual personal assistants that can reply to questions in specific disciplines or activities like health, education, travel and shopping.”
Apple’s Siri software is giving iPhone 4S owners a taste of what is possible. Alfred Spector, vice president of research for Google, told Lohr, “We’re on the verge now in this semantic world.” Lohr continues:
“With NELL, the researchers built a base of knowledge, seeding each kind of category or relation with 10 to 15 examples that are true. In the category for emotions, for example: ‘Anger is an emotion.’ ‘Bliss is an emotion.’ And about a dozen more. Then NELL gets to work. Its tools include programs that extract and classify text phrases from the Web, programs that look for patterns and correlations, and programs that learn rules. For example, when the computer system reads the phrase ‘Pikes Peak,’ it studies the structure — two words, each beginning with a capital letter, and the last word is Peak. That structure alone might make it probable that Pikes Peak is a mountain. But NELL also reads in several ways. It will mine for text phrases that surround Pikes Peak and similar noun phrases repeatedly. … NELL, Dr. Mitchell explains, is designed to be able to grapple with words in different contexts, by deploying a hierarchy of rules to resolve ambiguity. This kind of nuanced judgment tends to flummox computers.”
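The seeding-and-pattern process described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Carnegie Mellon implementation. The seed mountains, the four-sentence corpus, and the function names are all hypothetical. The sketch learns the words that precede known examples, uses those contexts to propose new candidates, and separately checks the orthographic cue mentioned in the quote (two capitalized words, the last being "Peak").

```python
import re

# Hypothetical seed examples for a "mountain" category.
seeds = {"Mount Everest", "Mont Blanc"}

# Made-up sentences standing in for Web text.
corpus = [
    "Hikers reached the summit of Mount Everest at dawn.",
    "They reached the summit of Mont Blanc in July.",
    "We reached the summit of Pikes Peak before noon.",
    "She bought a ticket to Pikes Place yesterday.",
]

# One or more capitalized words in a row, treated as a noun phrase.
NOUN_PHRASE = r"((?:[A-Z][a-z]+ )*[A-Z][a-z]+)"


def learn_patterns(seeds, corpus, width=3):
    """Collect the `width` words that precede each seed mention."""
    patterns = set()
    for sentence in corpus:
        for seed in seeds:
            idx = sentence.find(seed)
            if idx > 0:
                left = sentence[:idx].split()[-width:]
                if len(left) == width:
                    patterns.add(" ".join(left))
    return patterns


def propose_candidates(patterns, corpus, known):
    """Find capitalized phrases that follow a learned context pattern."""
    candidates = set()
    for pattern in patterns:
        regex = re.compile(re.escape(pattern) + " " + NOUN_PHRASE)
        for sentence in corpus:
            for match in regex.finditer(sentence):
                phrase = match.group(1)
                if phrase not in known:
                    candidates.add(phrase)
    return candidates


def looks_like_mountain(phrase):
    """Orthographic cue: two capitalized words, the last one 'Peak'."""
    words = phrase.split()
    return (
        len(words) == 2
        and all(word[0].isupper() for word in words)
        and words[-1] == "Peak"
    )


patterns = learn_patterns(seeds, corpus)
candidates = propose_candidates(patterns, corpus, seeds)
print(patterns, candidates)
```

Here two independent signals, the surrounding text and the shape of the phrase itself, agree on "Pikes Peak" while neither supports "Pikes Place". Reading "in several ways," as the quote puts it, is what lets one signal cross-check another.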
By now you should understand that rules are important. Given a few simple rules, computers can do some amazing things (even if those activities are a far cry from full cognition). Mitchell told Lohr that the surprising thing is that a system like NELL “works much better if you force it to learn many things, hundreds at once.” The ultimate purpose of NELL, he explained, is to provide “a foundation for improving machine intelligence.” Mitchell admits that NELL still has trouble with nuanced information. He explains:
“Take two similar sentences, he said. ‘The girl caught the butterfly with the spots.’ And, ‘The girl caught the butterfly with the net.’ A human reader, he noted, inherently understands that girls hold nets, and girls are not usually spotted. So, in the first sentence, ‘spots’ is associated with ‘butterfly,’ and in the second, ‘net’ with ‘girl.’ ‘That’s obvious to a person, but it’s not obvious to a computer,’ Dr. Mitchell said. ‘So much of human language is background knowledge, knowledge accumulated over time. That’s where NELL is headed, and the challenge is how to get that knowledge.’”
Lohr reports that a little human intervention is still required to keep NELL learning correctly. For example, NELL wanted to put “Internet cookies” into the “baked goods” category. One mistake like that can have cascading effects throughout the knowledge base. Once relationships and semantics are correct, Cycorp notes, there are many uses for refined knowledge bases, such as medical research, counterterrorism analysis, financial analysis, intelligent information dissemination, knowledge source integration, and network intrusion protection.
Enterra is obviously using Cyc’s knowledge base to help optimize supply chains. At the heart of Enterra’s approach is an artificial intelligence (AI) knowledge base that includes an ontology and extended business rules capable of advanced inference. An ontology interrelates concepts and facts with many-to-many relationships that are generationally more advanced and appropriate for artificial intelligence applications than standard relational databases. An ontology:
• Shares common understanding of the structure of information.
• Enables reuse of domain knowledge.
• Makes domain assumptions explicit and allows for encoding subtle and rich, multi-faceted relationships.
• Separates domain knowledge from the operational knowledge.
• Analyzes information and can expose non-obvious relationships.
In an ontology, relationships can be very rich and can be used to model the complexities of real-world relationships. The relationships can include subtleties such as usually, sometimes, frequently, and rarely, as well as more definitive relationships such as dependent upon, contains, part of, type of, and instance of. These types of multi-faceted, deep, and subtle relationships are difficult to encode within a database. Structuring data within an ontology creates the potential to understand the interrelationships and lets the computer infer non-obvious relationships, which can allow it to analyze and learn, and suggest optimization opportunities. Although this type of AI isn’t the kind of cognitive computing that scientists eventually hope to achieve, it is obviously very useful for addressing everyday challenges, especially when those challenges involve Big Data.
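As a small illustration of how a computer can infer relationships that were never stated, here is a minimal Python sketch. It does not represent Cyc or Enterra's software. The concepts (a hypothetical "Truck 17" in a supply chain) and the function names are assumptions made up for the example. Facts are stored as subject-relation-object triples, and "instance of" facts are extended by following "type of" links upward.

```python
# Hypothetical ontology fragment stored as (subject, relation, object) triples.
triples = {
    ("Truck 17", "instance of", "refrigerated truck"),
    ("refrigerated truck", "type of", "truck"),
    ("truck", "type of", "vehicle"),
    ("compressor", "part of", "refrigerated truck"),
}


def ancestors(concept, triples):
    """Follow 'type of' links upward and return every broader concept."""
    found = set()
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for subj, rel, obj in triples:
            if subj == current and rel == "type of" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def infer_instance_of(triples):
    """Derive 'instance of' facts that are implied but not stated."""
    inferred = set()
    for subj, rel, obj in triples:
        if rel == "instance of":
            for parent in ancestors(obj, triples):
                fact = (subj, "instance of", parent)
                if fact not in triples:
                    inferred.add(fact)
    return inferred


inferred = infer_instance_of(triples)
for fact in sorted(inferred):
    print(fact)
```

Nothing in the stored facts says that Truck 17 is a vehicle, yet the sketch derives it. Softer relationships such as "usually" or "rarely" could be modeled by attaching a qualifier to each triple, which is harder to express in a conventional relational schema.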