Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
373038 | System | 2014 | 16 Pages |
This paper describes an attempt to establish a pedagogically useful list of the most frequent semantically non-transparent formulaic sequences for non-English majors in an EFL context who need to read the textbooks of their fields in English. The list was compiled from a corpus of 20 million running words drawn from two hundred college textbooks across forty subject areas. To capture opaque formulae in widespread use, we applied a set of screening criteria, combining extraction with the program Collocate and manual checking. Based on frequency, range, meaningfulness, grammatical well-formedness and semantic non-compositionality, a total of 475 opaque formulaic sequences of 2–5 words were selected; together they accounted for approximately 2.08% of the running words in the corpus. The formulae identified were then tested against a frequency threshold in the 120 million words of academic texts in the 450-million-token Corpus of Contemporary American English (COCA) to verify whether they merit pedagogical attention. As with other wordlists, it is hoped that this phrase list may serve as a reference for EAP teaching.
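The two quantitative criteria named in the abstract, frequency and range, can be illustrated with a minimal Python sketch. It is not the authors' procedure, which used the program Collocate plus manual checking. The function names, the toy corpus and the cut-off values (`min_freq`, `min_range`) are all hypothetical, and the remaining criteria (meaningfulness, grammatical well-formedness, semantic non-compositionality) require human judgment and are not modeled here.

```python
from collections import Counter, defaultdict


def ngrams(tokens, n_min=2, n_max=5):
    """Yield every contiguous word sequence of n_min..n_max tokens."""
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def candidate_sequences(texts_by_subject, min_freq, min_range):
    """Return {ngram: (frequency, range)} for n-grams passing both cut-offs.

    texts_by_subject maps a subject label to a list of token lists.
    Frequency is the total count; range is the number of distinct
    subject areas in which the n-gram occurs. Both cut-offs are
    illustrative assumptions, not values taken from the paper.
    """
    freq = Counter()
    subjects_seen = defaultdict(set)
    for subject, texts in texts_by_subject.items():
        for tokens in texts:
            for gram in ngrams(tokens):
                freq[gram] += 1
                subjects_seen[gram].add(subject)
    return {
        gram: (count, len(subjects_seen[gram]))
        for gram, count in freq.items()
        if count >= min_freq and len(subjects_seen[gram]) >= min_range
    }


def coverage(selected, freq_table, total_tokens):
    """Share of running words covered by the selected sequences.

    Overlapping occurrences are not deduplicated in this sketch.
    """
    covered = sum(len(gram) * freq_table[gram][0] for gram in selected)
    return covered / total_tokens


# Toy corpus (hypothetical data): two subject areas, three short texts.
corpus = {
    "biology": [
        "in terms of cell growth".split(),
        "as well as in terms of size".split(),
    ],
    "economics": [
        "in terms of price as well as cost".split(),
    ],
}
result = candidate_sequences(corpus, min_freq=2, min_range=2)
total = sum(len(tokens) for texts in corpus.values() for tokens in texts)
share = coverage([("in", "terms", "of"), ("as", "well", "as")], result, total)
```

Frequency and range only narrow the candidate pool. Fragments such as "terms of" also pass both cut-offs in the toy corpus, which is why a manual screening step for meaningfulness and well-formedness is still needed.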