The Semantic Web is One Step Closer

Stephen DeAngelis

July 17, 2007

I first wrote about ongoing attempts to develop a semantic web last November [Web 3.0]. In February, I provided a brief update on progress towards creating the semantic web [Baby Steps for the Semantic Web]. The semantic web was again the subject of a post in March [Web 3.0 Still Advancing — Even if People Don’t Know What to Call It]. Recently, an article in BusinessWeek, written by Heather Green, discusses what Radar Networks, a software startup, is doing to advance the semantic web [“A Web That Thinks Like You,” 9 & 16 July 2007]. Radar Networks is headed by Nova Spivack, grandson of management guru Peter F. Drucker.

“His San Francisco startup, Radar Networks, is one of several outfits launching products that use ‘Semantic Web’ technology to let computers understand the nuances and relationships in information they encounter—in a way, say, that a human knows the difference between a baseball batter and cake batter. Later this summer, Radar will launch a private test of its service. A public version should follow in the fall and is certain to draw close scrutiny from Web techies. The company isn’t revealing many details, but it does confirm that the service uses Semantic technologies to help individuals and communities mine and share information from Internet sites, blogs, and social media services, such as YouTube.”

The “semantic technologies” discussed by Green include artificial intelligence.

“Built-in artificial intelligence will continually learn as people use the service and computers troll for similar information. Says Spivack: ‘We want to use networks to make people smarter.’ The Semantic Web has long been a sort of Holy Grail for Internet scientists. The problem with the way the Web works today is that for all its automation, the actual information that’s being sorted in a typical search is structured so that it is most understandable to people, not machines. Semantic systems use a standard format to classify all the Web’s information, whether it be airline flight tables and passenger data, biographies, or drug lab results, so that it can be read by computers. And unlike today’s search engines, Semantic Web technologies are designed not simply to look up information but to understand its meaning.”

In other words, Spivack wants to make people smarter by making the search tools they use smarter. As I’ve noted in my earlier posts, a full-blown semantic web probably won’t be available for about a decade because serious challenges continue to face developers. Green writes:

“It’s a daunting challenge, and one that’s being pursued by such big companies as IBM, Google, and Oracle, as well as tiny startups. Experts predict that in the next 5 to 10 years, companies will use the Semantic Web to build smarter search engines, automate everyday Web tasks such as comparison shopping, and identify connections between information stored in far-flung corporate databases. In 2004 a consortium led by Web architect Tim Berners-Lee set Semantic Web standards, such as how to tag information.”
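
To make the idea of "tagging information" a little more concrete, here is a minimal sketch in Python. It assumes a simple subject-predicate-object representation, similar in spirit to the triples used by the W3C's Semantic Web standards such as RDF. The identifiers (`triples`, `senses_of`, `batter_1`, `batter_2`) and the sample data are hypothetical illustrations of mine; they do not come from Green's article or from Radar Networks' product.

```python
# Hypothetical facts stored as (subject, predicate, object) triples.
# Two different things share the same human-readable label, "batter".
triples = [
    ("batter_1", "label", "batter"),
    ("batter_1", "type", "BaseballPlayer"),
    ("batter_2", "label", "batter"),
    ("batter_2", "type", "CakeIngredient"),
]


def senses_of(word, facts):
    """Return the sorted types of every subject labeled with `word`."""
    # Find every subject whose label matches the word.
    subjects = {s for s, p, o in facts if p == "label" and o == word}
    # Collect the declared type of each matching subject.
    return sorted(o for s, p, o in facts if p == "type" and s in subjects)


print(senses_of("batter", triples))
```

Because each "batter" carries an explicit type, a program can tell the two apart without guessing from the surrounding text, which is the distinction Green's baseball-versus-cake example describes.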

Green reports that although a full-fledged semantic web may be some way off, companies are already using semantic technologies to help them make sense of their own databases.

“A number of large organizations, including Citigroup and Eastman Kodak Co., have already started using Semantic technology to make better sense of their own data. Citigroup is experimenting with using it to help analysts, traders, and bankers sift through data to pinpoint trends. And the explosion of personal photos, videos, and blog posts that people identify with little tags on services such as Flickr or YouTube provides a trove of data that startups such as Radar and Metaweb Technologies Inc. can use to jump-start their databases.”

Enterra Solutions is also using semantic technologies (semantic graphs) to help others “connect the dots” in their various enterprises. Green concludes her article with a little more information about Spivack’s work.

“How quickly the future unfolds depends on whether people like Spivack can create a workable classification system. He has been thinking about it since the summer after his sophomore year at Oberlin College when he became fascinated with the notion of the universe as a vast computing system. That dovetailed with research Spivack was doing on the mind and discussions he was having with his grandfather about organizations and why communities work or fail. Spivack says Drucker, who introduced the idea of decentralized organizations in the 1940s and viewed them as human communities, never really understood the Internet. But his ideas about organizations were timeless, Spivack felt, and he applied them to his own work. Drucker died in 2005. A more practical lesson came courtesy of the dot-com crash. In 1994, Spivack co-founded a community Web site for tech developers called EarthWeb. The company went public at the height of the Internet frenzy in 1999, and Spivack’s net worth approached $20 million. But EarthWeb eventually was delisted and taken over by its banks. Still, the few million dollars in stock options that Spivack says he managed to cash out came in handy when he started working on Radar in 2001. Today it has 20 employees and is funded by investors that include Microsoft co-founder Paul Allen. Radar’s launch will no doubt fuel the debate among experts about how much structure the Web can handle. Some argue that computers still need a lot of help understanding what people mean by the words they choose. Projects that try to infer too much or don’t look to humans for help are doomed, says Clay Shirky, a consultant and adjunct professor in New York University’s graduate Interactive Telecommunications Program. He says: ‘How much structure can you overlay without everything getting brittle and breaking? Not much, I would argue.'”

It’s a bit ironic that Shirky sees brittleness where others see flexibility. The reason for this apparent paradox is that people are debating structure rather than standards. The web’s flexibility relies on standards even as it eschews structure. Shirky is correct that if a standards-based system is replaced by a structure-based system, the web will become brittle and eventually break (sooner rather than later, I would predict). I also predict that standards will eventually win the day and help maintain the web’s flexibility.