中英對照香港學校中文學習基礎字詞
Lexical Items with English Explanations
for Fundamental Chinese Learning in Hong Kong Schools

An Introduction to the Research on Hong Kong Chinese Lexicons for Primary Learning

Background

Chinese characters have a long history and their quantity is huge. For instance, the Hanyu Da Zidian (漢語大字典, or Dictionary of Chinese Characters) includes more than 56,000 entries. However, the characters we use in daily life are only a fraction of this huge number. From the learning perspective, children only need to learn the most frequently used characters and words. There is no need to spend time on learning characters that are obscure or rarely used.

How many Chinese characters and words, then, are needed for daily reading and writing? In the huge pool of Chinese characters and words, which are used more frequently? For the different Key Stages of learning, how many characters, and which characters, should be learnt? Educators have long been concerned with these questions and have wished to address them.

Since ancient times, there has been a tradition in China of writing children’s textbooks that aim at character learning. According to the Bibliographic Treatise in the Book of Han (漢書‧藝文志), there were more than ten types of character learning books for children at that time, including Shi Zhou (史籀) and Cangjie (倉頡). Later, there appeared the Master Essay in a Thousand Characters (千字文) and Three-character Classic (三字經), which were likewise of great importance to Chinese language teaching in ancient China. However, it is now a mystery how the characters and words were chosen to suit children’s needs at that time.

Nowadays, it is recognised that effective teaching materials for character learning must be able to reflect how the language is used. Therefore, researching language usage and avoiding subjective speculation are obviously of utmost importance in the study of how to teach Chinese characters.

In China, quantitative research on Chinese characters and Chinese lexicons can be dated back to as early as the 1920s. In 1922, Li Jinxi (黎錦熙) published Statistical Studies of the Basic Lexical Items in Chinese (國語基本語詞的統計研究). Later, Chen Heqin (陳鶴琴) selected 4,261 commonly-used characters out of a 550,000-character corpus and published The Applied Character List in Vernacular Writing (語體文應用字匯). In the 1970s, computer technologies became more advanced and affordable. With the help of the computer, various large-scale research projects on Chinese characters and lexical items have since been conducted, allowing us to have a relatively objective and precise understanding of the use of Chinese.

However, the scope of the commonly-used lexicons and the research results are greatly affected by the language materials selected as well as the cultural and political background of the language community. As one may notice, the statistics from Mainland China and Taiwan concerning word usage are not totally the same. Although there are quite a number of lexical statistics from Mainland China and Taiwan, those results can neither show the situation of language use in Hong Kong nor reflect the linguistic features of Hong Kong Chinese. As a result, we must produce our own statistics which can reflect the local language use in order to provide results with a comparatively high reference value for Hong Kong language education.

In 1990, the Curriculum Development Council of Hong Kong announced the Syllabus for Primary Chinese Language (小學中國語文科課程綱要), containing the List of Commonly Used Characters at Primary Level (小學常用字表) as an appendix, which listed 2,600 characters. In 1996, the Education Department of Hong Kong announced the List of Reference Vocabulary for Teaching in Primary Schools (trial version) (小學教學參考詞語表 (試用)) in which 6,765 words were listed. These materials have played a key role in promoting Chinese language teaching in Hong Kong primary schools by standardising and specifying lexical items for teaching, which in turn provided references for language teachers and textbook editors.

In recent years, there have been new changes and developments in many aspects of Hong Kong, including its society, economy, culture and education. Exchanges between the Mainland and Hong Kong have become significantly more frequent since the handover, and their mutual influence has grown steadily stronger. To reflect the recent changes in language use in Hong Kong more comprehensively, and to satisfy the needs of Chinese Language teaching in Hong Kong primary schools, there has been a pressing need for research on Hong Kong Chinese words for primary learning as well as the compilation of a new lexical list.

The Basis of Materials and Research Procedures

The Department of Chinese and Bilingual Studies at The Hong Kong Polytechnic University has established a Corpus of Modern Chinese in Mainland China, Taiwan and Hong Kong (中國大陸、台灣、香港現代漢語語料庫). The corpus contains language materials totalling 5,600,000 characters and over 60,800 words, and is the first fully completed corpus of modern Chinese to cover language materials from the Mainland, Taiwan and Hong Kong.

In 2003, the Department of Chinese and Bilingual Studies at The Hong Kong Polytechnic University, commissioned by the Education and Manpower Bureau, developed a character list and a list of reference words for primary teaching. The statistical research mentioned above on words used in the Mainland, Taiwan and Hong Kong served as the basis for the research on the Hong Kong Chinese lexicons for learning at primary level. To match the contents suggested by the new syllabus for the Chinese language, the research group, building upon the existing research, added the most up-to-date language materials as well as popular children’s readings and primary textbooks. The number of cultural and educational lexical items was increased, while the number of words relating to politics, finance and economics was reduced. The language materials were thereby enlarged to 6,520,000 characters, which became the basis of the new general lexical list. After categorisation and statistical processing, the usage and distribution of each word in different types of materials could be determined. All these materials then formed the basis for organising these Chinese lexical lists for primary learning.

In general, the number of commonly-used words needed for an adult to meet daily needs is around 15,000. So, researchers selected 15,000 words with the highest frequencies from the general lexical lists mentioned above for further analysis. The cumulative coverage of those lexical items is 96.3%.
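The idea of cumulative coverage can be illustrated with a minimal sketch. It assumes the corpus has already been segmented into word tokens; the function name and the toy token list are hypothetical and are not taken from the research itself. Coverage here means the share of all word tokens in the corpus that is accounted for by the N most frequent words.

```python
from collections import Counter


def cumulative_coverage(tokens, top_n):
    """Return the share of all tokens covered by the top_n most frequent words."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(freq for _, freq in counts.most_common(top_n))
    return covered / total


# Toy, made-up token list standing in for a segmented corpus (illustration only).
toy_tokens = ["我", "的", "學校", "的", "老師", "的", "我", "茶餐廳", "學校", "的"]

# Share of the toy corpus covered by its two most frequent words.
coverage = cumulative_coverage(toy_tokens, 2)
print(f"{coverage:.1%}")
```

Applied to a real corpus with top_n set to 15,000, the same calculation would yield a figure comparable to the 96.3% reported above, although the actual procedure used by the researchers is not described in detail here.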

Apart from learning local words, Hong Kong students should also master the commonly-used lexical items in Mainland China and Taiwan. Therefore, researchers conducted comparative analysis on the aforementioned selected 15,000 words and the 15,000 highest-frequency words extracted from the latest lexical lists from Mainland China and Taiwan as follows:

To reflect learning at primary level more precisely, researchers further compared the selected 15,000 words with two lexical lists for primary learning collected from Hong Kong and Beijing; and performed a comparative analysis according to two Key Stages of learning (Primary 1 to 3 and Primary 4 to 6):

After comparative analysis and computer processing of the raw data in the various lexical lists, followed by careful review and repeated selection, a preliminary lexical list of 9,400 words was drafted. During the selection process, researchers focused on the lexical distribution across those lists, based on the objective information collected from the comparative analysis, and tried to select the words that were common to the three regions across the Strait. Meanwhile, a suitable number of words denoting events and things that are special to Hong Kong, such as "八達通", "茶餐廳" and "便利店", were also selected. In addition, some words that do not have a high frequency in daily use but are very important in the primary learning context were included as well. Examples are "試題", "校服", "塗改", "課室", "乘數", "除數" and "兒歌".
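The comparison of regional lists described above can be pictured as simple set operations. The following is only a sketch under stated assumptions: the function name and the three toy lists are hypothetical, and the real analysis also drew on frequency data and manual review rather than set membership alone.

```python
def compare_lists(hong_kong, mainland, taiwan):
    """Split Hong Kong words into those shared by all three lists and those found only in Hong Kong."""
    hk, ml, tw = set(hong_kong), set(mainland), set(taiwan)
    return {
        "common_to_all": hk & ml & tw,   # candidates common to the three regions
        "hong_kong_only": hk - ml - tw,  # candidates special to Hong Kong
    }


# Toy, made-up lists for illustration only; not the research data.
toy_hk = ["學校", "老師", "八達通", "茶餐廳"]
toy_ml = ["學校", "老師", "地鐵"]
toy_tw = ["學校", "老師", "捷運"]

result = compare_lists(toy_hk, toy_ml, toy_tw)
print(sorted(result["common_to_all"]))
print(sorted(result["hong_kong_only"]))
```

Words in the first group correspond to the shared vocabulary the researchers prioritised, while the second group corresponds to items such as "八達通" that were added in suitable numbers.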

Furthermore, researchers also approached the task from a linguistic perspective. They studied the nature of word formation as well as the issue of systematisation in the collection of lexical items. For example, surnames and proper nouns were generally not collected, except for a few lexical items that are commonly used by Hong Kong students, such as "中國" and "香港". To give another example, lexical items that belong to open-ended series formed by analogy, like "三哥", "四哥", "五哥" and "初三", "初四", "初五", were generally not listed either.

Moreover, some highly productive affixes, such as "者", "室" and "商", cannot stand alone but can easily produce other words by analogy, forming word groups like "示威者", "流浪者", "陳列室", "閱覽室", "經銷商" and "出版商". These affixes were themselves listed to avoid an excessive collection of such derivative words. In addition, considering that a substantial number of monosyllabic words are found in Hong Kong Chinese, some frequently used stand-alone words, such as "餐", "入", "幫" and "竹", were collected as well.

On the other hand, to ensure the economy and conciseness of this lexical list, where words may appear in two different forms, like "爸, 爸爸", "媽, 媽媽" and "但, 但是", only one of the forms was collected. The suffix "子" is optional in some words and only entries with that suffix were collected. For instance, "椅子" is listed instead of "椅". No separate entries were given to homonyms and only some supplementary explanation was provided to illustrate the usage. For example, the character "花" used in "花朵" (meaning "flower") and "花費" (meaning "to spend") would be found under the same entry instead of two separate entries.

When the research was approaching its end, the draft of the Word List for Putonghua Teaching and Testing in Hong Kong, compiled by The Hong Kong Polytechnic University, was under internal review. To adjust the new lexical list to the Putonghua teaching situation in Hong Kong, another comparative analysis was carried out between the new list and the Standard of Graded Putonghua Lexicons (普通話詞語等級大綱), and some adjustments were made.

After confirming a preliminary scope for the lexical list, researchers divided the list into two Key Stages of learning (Primary 1 to 3 and Primary 4 to 6) according to the Primary Chinese Language curriculum of Hong Kong. The basis of dividing the stages rested on the data from different word frequency lists and the Key Stages of learning in the two word lists for primary learning made by Hong Kong and Beijing. In addition, experienced language teachers of primary schools formed a focus group to provide professional advice, and judgments were made only after repeated consideration. As a result, the current situation of language teaching in primary schools in Hong Kong could be fully taken care of.

Some special lexical items that primary students may come across in learning were not selected according to frequency, due to the constraints of the statistics of language materials. Those lexical items include idioms, word groups of four characters with a rather fixed combination, idiomatic phrases, proper nouns, terms, words in classical Chinese and transliterated loan words. They either have special meanings or colloquial characteristics, or else cultural contents and figurative significance, which are very important in primary learning. They were therefore included in the supplementary lists to serve as additional information.

Finally, there are characters that are commonly used in surnames and proper nouns in Hong Kong, such as "劉", "吳" and "趙", which are now solely used in surnames, and "娟", "曼", "琼" and "埗", "涌", "磡", "鱲", "輋", "砵", "滘", "笪", which are mainly used in names of people and places. Although they do not have a high frequency in the statistics, Hong Kong students would always see them in daily life; and thus they were included in the supplementary lists as well.

Research Principles

To summarise the research procedures above, researchers focused mainly on three factors when confirming the scope of the lexical list: 1. the usage and distribution of vocabulary in language materials; 2. the characteristics of lexical items used in Hong Kong; and 3. the real situations of primary education and children’s life.

The research mainly followed the principles below:

Research Results

With reference to the principles above, the final lexical list was set at 9,706 words, the majority of which are common to the three regions across the Strait, while a small portion are found solely in Hong Kong. There are 4,914 words in Key Stage I (Primary 1 to 3) and 4,792 words in Key Stage II (Primary 4 to 6). The distribution across the two Key Stages of learning is close to that of the two word lists made by Hong Kong and Beijing. There are 3,171 characters in the lexical lists, excluding those in the supplementary lists. The results reflect the objective situation of language use in Hong Kong. The reference word lists consulted cover a wide range, the selection process followed set criteria, and the source of every entry in the list can be traced. The total number of lexical items is therefore higher than that of the List of Lexical Items for Teaching Reference in Primary Schools (1996), and should be appropriate in terms of quantity.

Of the words in the new list, 5,860 also appear in the List of Lexical Items for Teaching Reference in Primary Schools (1996), amounting to 86.62% of that earlier list. In terms of characters, 2,461 also appear in the List of Lexical Items for Teaching Reference in Primary Schools (1996), amounting to 97.97% of the Chinese characters in that list, while 2,566 also appear in the List of Commonly Used Characters at Primary Level (1990), amounting to 98.69% of the Chinese characters in that list. Researchers believe that the proportion of old and new lexical items basically fulfils the expectations of language educators in primary schools. It not only shows that the old and new word lists are along the same lines, but also fully reflects the language development and changes in Hong Kong in recent decades.
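The figures above can be cross-checked with a short calculation. This sketch assumes that each percentage is measured against the size of the older list, using the totals given in the Background section (6,765 words in the 1996 list and 2,600 characters in the 1990 list); that reading is an inference from the numbers, and the function name is hypothetical.

```python
def percent_retained(retained, old_total):
    """Percentage of an older list that reappears in the new list, to two decimal places."""
    return round(100 * retained / old_total, 2)


# Words retained from the 1996 reference word list (6,765 words in total).
words_1996 = percent_retained(5860, 6765)

# Characters retained from the 1990 character list (2,600 characters in total).
chars_1990 = percent_retained(2566, 2600)

# The two Key Stage totals should add up to the size of the final list.
final_total = 4914 + 4792

print(words_1996, chars_1990, final_total)
```

Under this assumption the calculation reproduces the reported 86.62% and 98.69%, and the two Key Stage totals sum to 9,706.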

Acknowledgements

The completion of the "Research on Hong Kong Chinese Lexicons for Primary Learning" has relied heavily on the concerted efforts of experts, scholars, experienced teachers and professionals from relevant organisations. During the research, I was deeply touched by the efforts of Prof CHOW Kwok-ching of Hong Kong Baptist University and Mr LEE Siu-tat, Mr CHENG Man-leung, Ms LAI Sau-mei, Ms SUN San-tak, Ms AUYEUNG Oi-ling, Ms Alice SZE Chi-wing and Ms Alice CHAN Suk-ping of the Education Bureau of HKSAR, all of whom participated in reviewing the materials, and by Ms TANG Mei-lan, Ms MOU Suet-fong and Ms TAM Kit-wai, all experienced language teachers who participated in the teachers’ focus group.

We hereby express our sincere gratitude to the experts and consultants from Mainland China for their guidance and assistance, including Prof Fu Yonghe, Prof Chen Zhangtai, Prof Li Xingjian, Prof Tong Lequan and Prof Liu Yinglin. We also extend special thanks to the State Language Commission (國家語言文字工作委員會) and Taiwan’s Ministry of Education for their generous approval of the use of their lexical statistics.


CHAN Shui-duen
Department of Chinese & Bilingual Studies
The Hong Kong Polytechnic University

May 2007
