WordHacker
Preface
In short, a powerful vocabulary will guarantee your success in life!
So a strong command of English vocabulary is critical to your professional success. A broad and precise vocabulary will enable you to communicate at the high level needed to succeed in today's competitive environment.
The result: you expand
your English vocabulary and greatly improve your communication with a
focus on your professional success.
Although the language makes use of a large number of words, not all of these words are equally useful. One measure of usefulness is word frequency, that is, how often the word occurs in normal use of the language. From the point of view of frequency, the word "the" is a very useful word in English. It occurs so frequently that about 7% of the words on a page of written English, and the same proportion of the words in a conversation, are repetitions of the word "the". Look back over this paragraph and you will find an occurrence of "the" in almost every line.
The good news for second language learners and second language teachers is that a small number of the words of English occur very frequently, and a learner who knows these words will know a very large proportion of the running words in a written or spoken text. Most of these words are content words and knowing enough of them allows a good degree of comprehension of a text. Here are some figures showing what proportion of a text is covered by certain numbers of high frequency words.
Table 1: Vocabulary size and text coverage in the Brown corpus

| Vocabulary size | Text coverage |
|---|---|
| 1000 | 72.0% |
| 2000 | 79.7% |
| 3000 | 84.0% |
| 4000 | 86.8% |
| 5000 | 88.7% |
| 6000 | 89.9% |
| 15,851 | 97.8% |
The figures refer to written texts and are from Francis and Kucera (1982), based on a very diverse corpus of over 1,000,000 running words made up of 500 texts, each around 2,000 running words long. As we shall see, the more diverse the texts in a corpus, the greater the number of different words, and the high frequency words then cover slightly less of the text, so these figures are a conservative estimate. The figures in the last line of the table are from Kucera (1982). The COBUILD Dictionary claims that 15,000 words cover 95% of the running words of its corpus. The figures in Table 1 are for lemmas and not word families. Word families would give fractionally higher coverage. Table 1 assumes that high frequency words are known before lower frequency words and shows that knowing about 2,000 word families gives close to 80% coverage of written text. The same number of words gives greater coverage of informal spoken text, around 96% (Schonell, Meddleton and Shaw, 1956).
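To make the idea of text coverage concrete, here is a minimal Python sketch of how such a figure could be computed for any text. It is an illustration only, not the method used by Francis and Kucera: the function name `coverage_by_vocab_size` and the simple tokenizer are assumptions, and it counts raw word forms rather than lemmas or word families, so its numbers would not match the table exactly.

```python
import re
from collections import Counter


def coverage_by_vocab_size(text, sizes):
    """Return {n: share of running words covered by the n most frequent word forms}.

    Assumption: a "word" is any run of letters or apostrophes, lowercased.
    Real corpus studies group forms into lemmas or word families instead.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    if total == 0:
        return {n: 0.0 for n in sizes}
    # Frequencies of each distinct word form, highest first.
    freqs = sorted(Counter(tokens).values(), reverse=True)
    # Coverage of the top-n forms = their combined count / all running words.
    return {n: sum(freqs[:n]) / total for n in sizes}


sample = "the cat sat on the mat and the dog sat on the log"
result = coverage_by_vocab_size(sample, [1, 2, 3])
for size, share in result.items():
    print(f"top {size} word form(s) cover {share:.1%} of the running words")
```

Even in this tiny sample the most frequent form alone accounts for a large share of the running words, which is the same pattern the table shows at corpus scale.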
With a vocabulary
size of 2,000 words, a learner knows 80% of the words in a text which
means that 1 word in every 5 (approximately 2 words in every line) is
unknown. Research by Liu Na and Nation (1985) has shown that this ratio
of unknown to known words is not sufficient to allow reasonably successful
guessing of the meaning of the unknown words. At least 95% coverage is
needed for that. Research by Laufer (1989) suggests that 95% coverage
is sufficient to allow reasonable comprehension of a text. A larger vocabulary
size is clearly better.
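The arithmetic behind these ratios can be sketched in a few lines of Python. The function name `unknown_word_density` is hypothetical, and the default of 10 words per line is an assumption inferred from the "1 in 5, about 2 per line" figure above, not a value given in the cited research.

```python
def unknown_word_density(coverage, words_per_line=10):
    """Convert a text coverage fraction into unknown-word ratios.

    Returns (one_in, per_line): one word in every `one_in` is unknown,
    and about `per_line` unknown words appear on each line.
    Assumption: a line holds `words_per_line` running words.
    """
    if not 0.0 <= coverage < 1.0:
        raise ValueError("coverage must be at least 0 and below 1")
    unknown_fraction = 1.0 - coverage
    one_in = 1.0 / unknown_fraction
    per_line = unknown_fraction * words_per_line
    return one_in, per_line


for cov in (0.80, 0.95):
    one_in, per_line = unknown_word_density(cov)
    print(f"{cov:.0%} coverage: about 1 unknown word in every {one_in:.0f}, "
          f"roughly {per_line:.1f} per line")
```

Moving from 80% to 95% coverage cuts the density of unknown words to a quarter of its former value, which is why the gap between the two thresholds matters so much for guessing from context.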
What is it to learn a new word? Minimally we must recognise it as a word and enter it into our mental lexicon. But there are several lexicons specialised for different channels of Input/Output (I/O). To understand speech, the auditory input lexicon must categorise a novel sound pattern (which will be variable across speakers, dialects, etc.); to read the word, the visual input lexicon must learn to recognise a new orthographic pattern (or, in an alphabetic language, learn to exploit grapheme-phoneme correspondences in order to access the phonology and hence match the word in the auditory input lexicon); to say the word, the speech output lexicon must tune a motor programme for its pronunciation; and to write it, the spelling output lexicon must have a specification for its orthographic sequence. We must learn its syntactic properties. We must learn its place in lexical structure: its relations with other words. We must learn its semantic properties, its referential properties, and its roles in determining entailments. We must learn the conceptual underpinnings that determine its place in our entire conceptual system. Finally we must learn the mapping of these I/O specifications to the semantic and conceptual meanings.