Semantic Web and Semantic Search

Stephen DeAngelis

April 30, 2013

Sir Timothy “Tim” Berners-Lee, inventor of the World Wide Web, also coined the term “semantic web” nearly a score of years ago. It has taken most of two decades to develop technologies that hold the promise of making his vision a reality. Some people believe it may take another twenty years to perfect. Bill Kilpatrick, a student at Temple University’s Fox School of Business, offers this brief, but informative, description of the semantic web concept. [“What Is The Semantic Web?,” HASTAC, 15 November 2012] He writes:

“The technology aims to link up information, on a worldwide scale, in a way that is easily understood by machines. In essence, the Semantic Web will allow computers to process syntax closer to the way humans do by describing things in ways that computers can understand. For example, consider the following statements:

  • The Rolling Stones are a rock band.
  • Keith Richards plays guitar in the Rolling Stones.
  • ‘Brown Sugar’ was recorded by the Rolling Stones.

“Those sentences, and their mutual relationship, are easily comprehended by most people. To be understood by computers, however, they would need the ability to process syntax semantically. This process is likened to the use of hyperlinks, which connect a current web page to another one, thus defining a relationship between the two pages. However, on the Semantic Web, an important difference is that such relationships could be recognized between any two or more resources, if the information is properly structured. To do this, the Semantic Web uses special languages for detailing web-based resources and information, such as RDF (Resource Description Framework). Information put into RDF files allows computers to find, extract, store, analyze, and process web-based information, which the Semantic Web can then describe.”
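To make Kilpatrick's example concrete, here is a minimal sketch of how his three statements could be stored as subject-predicate-object triples, the basic shape of an RDF statement. It uses plain Python rather than a real RDF toolkit, and the short identifiers (RollingStones, playsGuitarIn, and so on) and the helper functions are invented for illustration; actual RDF data would identify each resource with a URI.

```python
# Each fact is a (subject, predicate, object) triple, the basic shape of an RDF statement.
# The identifiers are hypothetical short names, not real RDF URIs.
TRIPLES = {
    ("RollingStones", "type", "RockBand"),
    ("KeithRichards", "playsGuitarIn", "RollingStones"),
    ("BrownSugar", "recordedBy", "RollingStones"),
}


def match(subject=None, predicate=None, obj=None):
    """Return the sorted triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in TRIPLES
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


def related_to(resource):
    """List (subject, predicate) pairs for every statement whose object is the resource."""
    return [(s, p) for (s, p, _) in match(obj=resource)]


# A program can now "follow the links" between resources without parsing any sentences.
print(related_to("RollingStones"))
# → [('BrownSugar', 'recordedBy'), ('KeithRichards', 'playsGuitarIn')]
```

Once the relationships are explicit, a program can answer a question such as "what is connected to the Rolling Stones?" by pattern matching alone, which is the kind of machine-readable structure the quotation describes.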

Berners-Lee adds:

“When you use speech grammars and VoiceXML, you are describing possible speech conversations. When you use XML schema, you are describing documents. RDF is different. When you use RDF and OWL, you are talking about real things. … Because this information is about real things, it is much more reusable. … The general properties of a car, or a product of your company, of real things, change rarely. They are useful to many applications. This background information is called the ontology, and OWL the language it is written in.” [“Speech and the Future,” World Wide Web Consortium, 14 September 2004]

A company called hakia, which advertises itself as “a pioneering company in semantic search technology,” notes that a semantic search has to achieve at least ten things in order to be successful. [“What is Semantic Search?”] They are:

1- Handling morphological variations (like tenses, plurals, etc.)
2- Handling synonyms with correct senses (like cure, heal, treat,.. etc.)
3- Handling generalizations (like disease = GERD, ALS, AIDS, etc.)
4- Handling concept matching
5- Handling knowledge matching (like swine flu = H1N1, flu=influenza)
6- Handling natural language queries and questions (like what, where, how, why, etc.)
7- Ability to point to uninterrupted paragraph and the most relevant sentence
8- Ability to Customize and Organic Progress — Semantic search allows customization in various stages by the owners of the system as well as the user of the system (i.e., such as semantic tagging) where search becomes a part of a social network formed around a business.
9- Ability to operate without relying on statistics, user behavior, and other artificial means
10- Ability to detect its own performance — A semantic search engine is expected to produce a relevancy score that reflects the degree of meaning match. … Accordingly, the search engine can understand its poor performance to automatically flag areas of improvement that is needed.
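The first three items on this list (morphological variations, synonyms, and generalizations) can be illustrated with a toy sketch. This is not hakia's method, which the source does not describe. The lookup tables, the suffix list, and the function names are all assumptions made for the example, and a real engine would rely on far richer linguistic resources.

```python
import re

# Hypothetical, hand-written lookup tables built from the examples in the list above.
SYNONYMS = {"heal": "cure", "treat": "cure", "influenza": "flu"}
GENERALIZATIONS = {"gerd": "disease", "als": "disease", "aids": "disease"}
VOCABULARY = (
    set(SYNONYMS) | set(SYNONYMS.values())
    | set(GENERALIZATIONS) | set(GENERALIZATIONS.values())
)


def lemma(word):
    """Fold simple plurals and tenses by stripping a suffix until a known word appears."""
    candidates = [word] + [
        word[:-len(suffix)]
        for suffix in ("ing", "ed", "es", "s", "d")
        if word.endswith(suffix)
    ]
    for candidate in candidates:
        if candidate in VOCABULARY:
            return candidate
    return word


def concepts(text):
    """Map text to the set of known concepts it mentions, including broader categories."""
    found = set()
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        base = lemma(token)
        base = SYNONYMS.get(base, base)  # collapse synonyms onto one canonical term
        if base in VOCABULARY:
            found.add(base)
        if base in GENERALIZATIONS:      # also record the more general concept
            found.add(GENERALIZATIONS[base])
    return found


def concept_overlap(query, document):
    """Concepts shared by a query and a document, even when no literal word is shared."""
    return concepts(query) & concepts(document)


print(sorted(concept_overlap("treating diseases", "New drug heals ALS patients")))
# → ['cure', 'disease']
```

The query and the document in the usage line have no word in common, so a plain keyword index would find no match, yet both reduce to the same two concepts.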

The ability to confidently discover semantic relationships dramatically improves search results. Hakia insists, “Conventional search systems (indexing keyword search) can no longer meet the increasing demand for quality results and time-saving practices in today’s world, nor do they offer any room for progress for the future. As a result, semantic search has increasingly been the choice as the next step by many businesses during the last decade.” It continues:

“Semantic search technology is based on a computerized system that understands content and query similar to how the human brain processes natural languages. Instead of matching the occurrence of words or symbols (as done in indexing systems), semantic search systems match concepts and their meaningful variations. As a result, a number of benefits emerge:

  • Accuracy: Improves the accuracy of the search results exponentially
  • Focus: Transforms search function from pointing a document to pointing a direct answer
  • Engagement: Allows flexibility to use natural language queries, thus increases user engagement
  • Intelligence: Enables semantic understanding of the user behavior via search analytics
  • Control: Prevents manipulations from content providers and users
  • Independence: Does not rely on external inputs (i.e., popularity) for base performance
  • Progress: Allows customization, user input, organic improvement

These advantages result in indisputable return of investment that are reported in several enterprise-wide case studies.”

Eric Savitz agrees that semantic search has the potential to change the business landscape dramatically. [“5 Ways Semantic Search Will Disrupt Business,” Forbes, 20 June 2012] He believes semantic search has a promising future because it can help make sense of big data. He writes:

“As the Big Data dialogue progresses and the information onslaught grows more acute, a single technology has quietly evolved that holds the promise to put our data anxieties to rest and it’s been here all along. One could even say that it was one of the origins of the Big Data challenge – semantic search. Semantic search significantly improves search accuracy and relevance by understanding a searcher’s true intent and the contextual meaning of words. The technology considers the context of search, location, intent, word variation, synonyms, multiple meanings of words and foreign language to provide users with exactly what they are seeking. Say good-bye to the days of search results with endless pages of blue links.”

Savitz believes there “are 5 areas where semantic search will disrupt and transform business and help solve for the global data obesity challenge.” They are:

  • SEO: By 2016, the interactive advertising business will reach $77 billion, according to Forrester. … Semantic contextualization will enable better … targeted ads based on a searcher’s intent. One to one consumer/advertiser relationships will reach its full potential with semantic search. …
  • Database Management: Using semantic search to build new ways of analyzing the massive amounts of data that businesses are generating will allow them to identify new business opportunities. …
  • Drug Discovery: Using semantic search will allow streamlined access to critical information necessary for complex product development such as drug discovery. … Semantic technology can help choose better candidates by using matching technology combined with scoring and ranking systems, saving money and re-filling the global drug pipeline.
  • Travel: … The next big transformation in [the] travel industry will be booking the complete travel package. Semantic search can dramatically simplify discovering destinations and activities, reduce the complexity involved with tailoring a vacation. …
  • Human Capital Management: … Semantic search holds the promise of becoming the killer app in human capital management because its sophisticated recognition process enables finding the right needle in a million data haystacks.

The most important word in all of these discussions, according to Doc Sheldon, is “semantic.” He writes, “‘Semantic’, … is a qualifier that means a great deal in this context. It demands that a machine, or more accurately, the software that drives that machine, must understand the information in the way it was intended. Let’s face it: most of us know a handful of human beings that are challenged in that regard.” [“Semantic Search in 2025,” Search Engine Watch, 6 November 2012] He continues:

“Indeed, for a machine to comprehend the meaning behind what a human has put to text, requires a certain amount of artificial intelligence. Humor, irony, and emotion certainly seemed to be beyond the conceivable limits of a computer program in 1994. Even in 2012, there are still some that doubt that such comprehension will be possible in the near future.”

As the title of his post reveals, Sheldon believes that the semantic web will be a reality by 2025. He believes that all of the tools and data are in place to make this happen. They just need to be refined over the coming decade.