Tracking modern Chinese language with LIVAC

 

Which individuals in the Chinese-speaking communities of Hong Kong, Taiwan, and Beijing have had the most media exposure over the last two weeks? Which words were used most frequently? You may think these are questions with no definite answers, only subjective guesses. In fact, precise, statistics-based answers to these and similar questions are only a click away in the Synchronous Linguistic Variation in Chinese Speech Communities (LIVAC) Corpus (www.rcl.cityu.edu.hk/livac/sample), developed by the Language Information Sciences Research Centre (LISRC), a CityU University Research Centre.

The three key indices of the LISRC, "Celebrity Roster", "Place Name Rank", and "Common Word List", are compiled from the Synchronous LIVAC Corpus. First launched in 1994 by LISRC Director and Chair Professor of Linguistics and Asian Languages, Professor Benjamin T'sou, the LIVAC Corpus is one of the Competitive Earmarked Research Grants projects supported by Hong Kong's Research Grants Council.

A ten-year research project

Since July 1995, the LIVAC database has been regularly compiled with linguistic data from the major newspapers and electronic media of six Chinese-speaking communities: Hong Kong, Taiwan, Beijing, Shanghai, Macau, and Singapore. Words and phrases are first selected automatically by computer and then manually proofread and categorized. From this, a database organized by linguistic unit (character, phrase, sentence, and text) is constructed. The database is useful for linguists and for anyone interested in exploring linguistic phenomena, social organization, culture, and other developments in Chinese communities.

In early 2001, the size of the corpus exceeded 70 million characters and 400,000 phrases, and it continues to expand. The part of the corpus currently available on the web comprises approximately 16 million characters and 190,000 phrases, and consists mainly of linguistic data compiled from July 1995 to June 1997. According to the LISRC schedule, the database will be expanded and renewed until June 2005. By the end of the project, the corpus is estimated to total 100 million characters and 600,000 phrases.

A Chinese language time capsule

"The corpus is like a time capsule, capturing the social, cultural, and linguistic developments of the six Chinese-speaking communities within a decade," Professor T'sou explained. "This provides valuable primary research materials for linguists and those interested in studying Chinese societies." One of the many important objectives of the corpus is to explore in depth the dynamics of the development of modern Chinese vocabulary. This includes examining the origins and subsequent forms of new-concept words, the development of word meanings, the transference of old phrases, and phrases with local colour.

Guess how many common Chinese translations can be found for the term "Internet" in the six targeted communities? According to LIVAC records between 1995 and 2000, there are at least 13, and the most frequently used translation varies between the different Chinese-speaking communities. For instance, in Hong Kong, "互聯網" (pronounced hu lian wang in Putonghua) is often used; in Taiwan, "網際網路" (wang ji wang lu); in Singapore, "网际网络" (wang ji wang luo); in Macau, "互聯網絡" (hu lian wang luo); and in Shanghai and Beijing, "因特网" (yin te wang).
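For readers curious how such a per-community ranking might be computed, here is a minimal Python sketch. It is not LIVAC's actual software: the function name `most_frequent_by_community` is hypothetical, and the handful of observations are invented toy data for illustration only, not real corpus counts.

```python
from collections import Counter, defaultdict

# Invented toy observations (community, romanized term); NOT real LIVAC data.
observations = [
    ("Hong Kong", "hu lian wang"),
    ("Hong Kong", "hu lian wang"),
    ("Hong Kong", "wang ji wang lu"),
    ("Taiwan", "wang ji wang lu"),
    ("Taiwan", "wang ji wang lu"),
    ("Taiwan", "hu lian wang"),
    ("Beijing", "yin te wang"),
    ("Beijing", "yin te wang"),
    ("Beijing", "hu lian wang"),
]


def most_frequent_by_community(pairs):
    """Return the single most frequent term for each community."""
    counts = defaultdict(Counter)
    for community, term in pairs:
        counts[community][term] += 1
    # most_common(1) yields a one-element list of (term, count) tuples.
    return {c: counter.most_common(1)[0][0] for c, counter in counts.items()}


top_terms = most_frequent_by_community(observations)
for community, term in top_terms.items():
    print(f"{community}: {term}")
```

A real corpus query would also need word segmentation and manual proofreading before counting, as the compilation process described earlier suggests.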

Professor T'sou said, "The Chinese language is diverse, not a single entity. It carries different local colour in different communities. People often criticize the written Chinese used by young people in Hong Kong as being mingled with Cantonese colloquial expressions. This is in fact a value judgment. The same language in the same locale develops differences with the passage of time. Language never stops evolving. The corpus lets us see the developments and variations of the modern Chinese language in different Chinese communities over the last 10 years."

Unlimited application potential

The process of building the database is long, laborious and tedious, similar to "cultivating a barren continent" or "moving a huge mountain", Professor T'sou said. "However, when the task is completed and the result is a 'feast' to be shared by all who are interested, we forget about the hardship and feel rewarded."

 

Apart from academic research, a large linguistic corpus with built-in search and statistical functions has enormous potential for application. It is increasingly common for Hong Kong's law courts to use Cantonese, and the Synchronous LIVAC Corpus can be used in the process of recording litigation. Mobile phones designed for Chinese input also need the support of a large linguistic database. In fact, as Professor T'sou pointed out, some network and IT product development companies, such as the Japanese telecom giant NTT, Hong Kong's leading web content provider tom.com, and a subsidiary of AOL, have already started applying the LIVAC database.

 
