"I have been interested in India ever since, as a student at Wharton, I organized our MBA program's first trek to the country," Cindy Deng remarked.
The MBA student from the Wharton School of the University of Pennsylvania, who has focused her Master's thesis on the future of India's economic development, has set out to study the level of interest in, and need for, Indic (a branch of the Indo-Iranian family of languages) language content on the internet.
"It was during this research that I first became aware of how fractured the language landscape of India was. Now, years later, with improved internet penetration, advances in processing scripts, the rise of Indian language blogs, etc., it became apparent to me that Indic languages could be on the verge of bursting onto the world wide web," said Deng, who is now conducting a survey on the 'Usage of Local Regional Indian Languages'.
Through the survey, she aims to determine how much interest in, and need for, Indic language content exists on the internet.
Given the global nature of the medium, creating content in local languages on the internet becomes exceedingly important.
According to internetworldstats.com, between 2000 and 2008, Arabic use on the web grew over 2000 per cent, Chinese use surged 755 per cent and Portuguese 668 per cent.
As of June 30, 2008, English, surprisingly, staked claim to just 31 per cent of the web, while Chinese followed with 20 per cent. The remaining 49 per cent was shared by Spanish (7 pc), Arabic (5 pc), Japanese (2 pc), French (6 pc), Portuguese (4 pc), Korean (1 pc), Italian (1 pc) and German (1 pc), followed by the rest of the languages (22 pc).
Distressingly, this shows that Indian languages stand alongside hundreds of other languages in the 'rest of languages' category. Although one could argue that the scenario has changed over the past two years, the reality of Indian language content remains grim.
"While certain cultures have significant online content in their native languages, i.e. Japanese, Chinese, etc., India as a culture does not. This survey aims to identify the cause for this, whether it be the lack of need due to widespread English use, the lack of interest among Indians for Indic content, or something else," Deng observed.
According to 2010 research jointly conducted by the Internet and Mobile Association of India (IAMAI) and IMRB, only 25 per cent of Internet users are aware of vernacular content online.
The research, 'Report on vernacular Content: 2010', found that of the 13 million active internet users aware of online vernacular content, 9.8 million are aware of regional language content in email and only 5.8 million use it.
The report also showed that about 6.6 million are expected to access the Internet through vernacular content. Similarly, about 6.3 million are aware of search engines in local languages but only 3.1 million use them.
This indicates that although a large number of people are aware of local language content on the web, a substantially smaller number actually use it.
The survey "is an exploratory step in what will hopefully be a new business enterprise that will help to stimulate both the creation and dissemination of Indic language content."
Explaining the ultimate goal of her study, Deng stated, "My research has also indicated that reliable translation programs are still far from ready, in part due to the lack of a large enough corpus of Indic language material online. My project, if successful, can help to overcome this problem by adding volumes of material to the existing corpus."
The results of the survey so far have been encouraging for Deng, as her surmise that there is huge demand for Indian language content on the web seems to be gaining strength.
"While Indic language content on the web is still extremely limited relative to the number of Indic language speakers, I believe that there is a huge demand for it, and the results of this survey, thus far, confirm that hypothesis.
"With over 2,000 unique respondents thus far and more results coming in daily, Indians' preference for Indic language content trumps English in every major category of web content."