The Multilingual Web

Stephen DeAngelis

January 14, 2009

For English speakers, the World Wide Web has been a godsend because its content has mostly been created and disseminated in English. As globalization penetrates ever deeper into developing countries, that will change [“Writing the Web’s Future in Numerous Languages,” by Daniel Sorid, New York Times, 30 December 2008]. Sorid writes:

“The next chapter of the World Wide Web will not be written in English alone. Asia already has twice as many Internet users as North America, and by 2012 it will have three times as many. Already, more than half of the search queries on Google come from outside the United States.”

Sorid’s article is accompanied by a useful graphic showing current Internet users alongside projected users in 2012. It indicates that Internet use in the West is not expected to grow much between now and 2012 and that most of the growth will take place in the developing world. This pattern of Western saturation and developing-world growth is mirrored in most sectors of the global economy. It’s also why I see so many opportunities in the developing world.

“The globalization of the Web has inspired entrepreneurs like Ram Prakash Hanumanthappa, an engineer from outside Bangalore, India. Mr. Ram Prakash learned English as a teenager, but he still prefers to express himself to friends and family members in his native Kannada. But using Kannada on the Web involves computer keyboard maps that even Mr. Ram Prakash finds challenging to learn. So in 2006 he developed Quillpad, an online service for typing in 10 South Asian languages. Users spell out words of local languages phonetically in Roman letters, and Quillpad’s predictive engine converts them into local-language script. Bloggers and authors rave about the service, which has attracted interest from the cellphone maker Nokia and the attention of Google Inc., which has since introduced its own transliteration tool. Mr. Ram Prakash said Western technology companies have misunderstood the linguistic landscape of India, where English is spoken proficiently by only about a tenth of the population and even many college-educated Indians prefer the contours of their native tongues for everyday speech. ‘You’ve got to give them an opportunity to express themselves correctly, rather than make a fool out of themselves and forcing them to use English,’ he said.”

As you can imagine, when a written language uses hundreds or thousands of characters, manufacturing a computer keyboard that users can operate easily is a daunting challenge. Yet Sorid is correct in predicting that the future of the Web will be written in many languages, not just English or other languages that use Roman letters. When Enterra Solutions established a Web site for selling Iraqi goods and services, we had to ensure that the site was available in Arabic, Kurdish and English. Sorid reports that those looking for full-time, long-term employment might consider learning a new language and creating Web content.

“American technology giants are spending hundreds of millions of dollars each year to build and develop foreign-language Web sites and services — before local companies like Quillpad beat them to the punch and the profits.”

Sorid’s article, however, focuses on India, whose multilingual environment is more complex than most.

“Nowhere are the obstacles, or the potential rewards, more apparent than in India, whose online population [JupiterResearch, an online research company based in New York,] says is poised to become the third-largest in the world after China and the United States by 2012. Indians may speak one language to their boss, another to their spouse and a third to a parent. In casual speech, words can be drawn from a grab bag of tongues. In the last two years, Yahoo and Google have introduced more than a dozen services to encourage India’s Web users to search, blog, chat and learn in their mother tongues. Microsoft has built its Windows Live bundle of online consumer services in seven Indian languages. Facebook has enlisted hundreds of volunteers to translate its social networking site into Hindi and other regional languages, and Wikipedia now has more entries in Indian local languages than in Korean.”

Sorid notes that Google learned a valuable lesson in China when its search service was surpassed by local competition. The company doesn’t intend to let that happen in India.

“Google’s initiatives in India are aimed at opening the country’s historically slow-growing personal computer market, and at developing expertise that Google will be able to apply to building services for emerging markets worldwide. ‘India is a microcosm of the world,’ said Dr. Prasad Bhaarat Ram, Google India’s head of research and development. ‘Having 22 languages creates a new level of complexity in which you can’t take the same approach that you would if you had one predominant language and applied it 22 times.’ Global businesses are spending hundreds of millions of dollars a year working their way down a list of languages into which to translate their Web sites, said Donald A. DePalma, the chief research officer of Common Sense Advisory, a consulting business in Lowell, Mass., that specializes in localizing Web sites. India — with relatively undeveloped e-commerce and online advertising markets — is actually lower on the list than Russia, Brazil and South Korea, Mr. DePalma said. Mr. Ram of Google acknowledged that the company’s local-language initiatives in India did not yet generate significant revenue.”

India is not alone in having an underdeveloped or non-existent e-commerce sector. That was the situation Enterra Solutions found when it started doing business in Iraq. Building an e-commerce sector from the ground up is difficult in places where Internet usage is low and electronic transactions remain little understood. Nevertheless, building up such a sector is critical for connecting local economies to the global economy. Connecting to local consumers is just as important, and that is where sites written in local languages become essential. India is a good example.

“English simply will not suffice for connecting with India’s growing online market, a lesson already learned by Western television producers and consumer products makers, said Rama Bijapurkar, a marketing consultant and the author of Winning in the Indian Market: Understanding the Transformation of Consumer India. ‘If you want to reach a billion people, or even half a billion people, and you want to bond with them, then you have no choice but to do multiple languages,’ she said. Even among the largely English-speaking base of around 50 million Web users in India today, nearly three-quarters prefer to read in a local language, according to a survey by JuxtConsult, an Indian market research company. Many cannot find the content they are seeking. ‘There is a huge shortage of local language content,’ said Sanjay Tiwari, the chief executive of JuxtConsult.”

Most developing countries are desperate to create jobs for their burgeoning populations. The e-commerce sector is one area where such jobs could be created. Sorid provides an example of what Microsoft is doing in this area in India.

“A Microsoft initiative, Project Bhasha, coordinates the efforts of Indian academics, local businesses and solo software developers to expand computing in regional languages. The project’s Web site, which counts thousands of registered members, refers to language as ‘one of the main contributors to the digital divide’ in India. The company is also seeing growing demand from Indian government agencies and companies creating online public services in local languages. ‘As many of these companies want to push their services into rural India or tier-two towns or smaller towns, then it becomes essential they communicate with their customers in the local language,’ said Pradeep Parappil, a Microsoft program manager. … ‘Localization is the key to success in countries like India,’ said Gopal Krishna, who oversees consumer services at Yahoo India. Google recently introduced news aggregation sites in Hindi and three major South Indian languages, and a transliteration tool for writing in five Indian languages. Its search engine operates in nine Indian languages, and can translate search results from the English Web into Hindi and back. Google engineers are also plugging away on voice recognition, translation, transliteration and digital text reading that it plans to apply to other developing countries.”

It should come as no surprise that globalization rapidly becomes localization when companies reach out to consumers. Localization, regionalization, and globalization each have a place in the larger scheme of things. The Web may very well turn out to be a tool that helps save struggling languages and dialects. One thing is certain: when people are connected, their lives are changed for the better. Reaching the so-called “bottom billion” is important. Companies that learn how to supply the needs of the bottom billion will not only have a bright future but will also become partners in improving those people’s lives. Clearly there are challenges in dealing with numerous languages on a single Web site, but each challenge also represents an opportunity. Entrepreneurs seek out those opportunities, just as Ram Prakash Hanumanthappa did in creating Quillpad. Imagine the challenges and opportunities that lie ahead as we move toward a semantic Web that tries to make meaningful connections among hundreds of languages.