Summer 2006

Analysis of a Recent TESL-L Discussion About Lexical, Semantic, and Thematic Sets

Lida Baker started a recent discussion on the TESL-L listserv in which the subject header on a number of postings was "Semantic sets." The first response, by Brett Reynolds, noted that Keith Folse has called it a myth that "presenting new vocabulary in semantic sets facilitates learning," and cited research by Tinkham (1993) and Waring (1997) showing that "presenting new words that share a common superordinate in a set of words to learn, does interfere with learning." The original TESL-L message further suggested that, in L1 learning, "semantically related words are easier to remember than lists of unrelated words," but "that generalizing from the L1 research to L2 teaching is unwarranted." Among the semantic associations that could potentially "confuse" L2 learners of English, according to Reynolds, are:

-membership in a natural class (e.g., fruit names; addressed in the two studies and found to be confusing)
-relationship of a class member with name of the class (e.g., apple & fruit)
-similarity in spelling/pronunciation (particularly in the first few letters)
-similarity in length
-belonging to a given object (e.g., keyboard, monitor, mouse, etc. belong to a computer)
-having opposite meanings (e.g., long vs. short)
-having largely overlapping meanings (e.g., lend, loan, rent, borrow; often confused in my experience)
-belonging to the same category of words (e.g., nouns)

The next comment on TESL-L came from Keith Folse himself, who summarized the research as showing that "students had more difficulty learning new words presented to them in semantic sets than they did learning semantically unrelated words" (Tinkham, 1993); "learners needed about 50 percent more time to learn related pairs of words than the unrelated pairs" (Waring, 1997); and "semantic clustering was detrimental to vocabulary learning [while] thematic clustering was slightly facilitative" (Tinkham, 1997).

Folse makes an important distinction between a teacher or textbook presenting a semantic set of new vocabulary words in a list, and presenting the same array of semantically related words in the context of "a story of some kind." He also clarifies the difference in research between semantically related lists of vocabulary in L1 and in L2. "In all cases, of course, native speakers can recall more words in the semantic set condition, but this is not about vocabulary learning," says Folse. "This is about a native speaker's ability to use a mnemonic device (grouping) to recall known words."

Another posting gave the citation for a 2000 article by Paul Nation that outlined the arguments against teaching vocabulary in lexical sets:

Nation, Paul. (2000). Learning vocabulary in lexical sets: dangers and guidelines. TESOL Journal, v. 9, n. 2, pp. 6-10.

Then Brett Reynolds sent a follow-up message, suggesting that:

Paul has written about this in a number of places; but to the best of my knowledge he has not done any primary research addressing the topic and relies on arguments and the results of the previously mentioned articles by Tinkham and Waring.

The Folse message cited above also mentioned an interesting problem with doing primary research on the use of lexical sets for vocabulary learning. "I have tried to design a study to test this question," said Folse, "but it is almost impossible to find real English words that none of the participants know."

The last message in this thread on the TESL-L listserv, from James Jenkin, suggested that perhaps "presenting vocabulary in what seems to be an 'organised' way...rather than dealing with new vocab as it arises in context can contribute to motivation, as the learning process might 'feel' more systematic."

The Nation position referenced above is summarized as follows by James M. Ranalli in his 2003 article, "ELT coursebooks in the age of corpus linguistics: constraints and possibilities":

Nation has noted that lexical items grouped by grammatical category may be more difficult to learn, so presenting sets of verbs, nouns or adjectives that have similar patterns--the approach taken in several Cobuild practice and reference books--may not be an effective strategy, except perhaps later after at least some of the items already have a foundation in memory. -Ranalli (2003)

As I read the exchanges in this TESL-L discussion, I was intrigued by the repetition of the phrase "semantic sets"; I could not imagine how a teacher would organize a communicative lesson without groups of semantically networked words. Bill VanPatten, Nick Ellis, and many other researchers have found that the single most decisive factor in language acquisition is input: massive amounts of "comprehensible input." According to VanPatten and others, comprehensible input is written or spoken language that has a communicative purpose and is understood. How can there be a communicative purpose, or communication at all, if the words being used are not semantically related to each other?

Later posts, including the one by Keith Folse referenced above, make it clear that the warning concerned the use of "lexical sets." His discussion helped me realize what is actually being claimed: that vocabulary lists based on lexical categories do not help learners remember those words for use in L2 communication.

The main reason that research on lists of lexical sets in second language instruction is so relevant today is that there are now gigantic corpus-based lists of spoken and written English which describe and define the language with more specificity than was ever possible before. The real question today is how to use corpus technology wisely in the context of ESL/EFL teaching and learning. The strong version of one argument that is driving some textbook design was articulated in 2000 by Andrew Sheehan in the English Teaching Forum Online:

Perhaps the most significant figure, however, is the one corresponding to the most frequent 2000 words. The top 2000 words account for about 80% of texts. In other words, a learner who knows the most frequent 2000 words will be able to understand about 80% of a text (or, to put it another way, one in five words, or 20%, will be unknown). From this evidence, we can surmise that 2000 words is the absolute minimum a language learner needs—the survival level—in order to be able to process a text. Any fewer, and the unknown gaps in the text will be too many to enable the learner to deduce meaning from context. -Andrew Sheehan
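The coverage arithmetic in Sheehan's passage can be made concrete with a small sketch. It counts the share of running words in a text that appear in a learner's known-word list. The function names, the toy word list, and the sample sentence are all hypothetical and chosen only for illustration; a real analysis would draw its word list from a corpus-based frequency list, and this sketch says nothing about how much of a text a learner actually comprehends.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def top_n_words(corpus_text, n):
    """Return the n most frequent word types in corpus_text (hypothetical helper)."""
    counts = Counter(tokenize(corpus_text))
    return {word for word, _ in counts.most_common(n)}


def coverage(text, known_words):
    """Return the fraction of running words in text that are in known_words."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in known_words)
    return known / len(tokens)


# Toy illustration (assumed data): the learner knows four words.
known_words = {"the", "cat", "sat", "on"}
sample_text = "The cat sat on the mat; the cat sat still."

# 8 of the 10 running words are known, so 2 in 10 (one in five) are unknown.
print(f"{coverage(sample_text, known_words):.0%}")  # prints 80%
```

In this toy case the two unknown words ("mat" and "still") are the one-in-five gap that Sheehan describes.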

Sheehan also lays out four major categories of lexical items, from Lewis (1993, 1996):

  • words, e.g., push, exit, fruit; polywords, e.g., by the way, on the other hand

  • collocations or word partnerships, e.g., an initial reaction, to assess the situation

  • institutionalized utterances or fixed expressions, e.g., I’ll see what I can do, It’s not the sort of thing you think will ever happen to you.

  • sentence frames or heads, e.g., Considerable research has been done in recent years on the question of…; At present, however, expert opinion remains divided; Some experts believe….
    (from Lewis 1996, 10)

The "lexical sets" argument, as articulated earlier by Nation, Folse, Tinkham, and Waring, and as referred to in the discussion on TESL-L, can be read in two ways in light of the growth of corpus linguistics. It can be seen as supporting curriculum and instruction that use lists based on frequency rather than lexical categories, or as suggesting a limited role for any list-based vocabulary instruction, i.e., not allowing corpus technology to get ahead of, or to dictate the direction of, second language teaching theory.

    Article by Robb Scott
    Editor, ESL MiniConference Online

    2006 ESL MiniConference Online

