Article ID: 373038
Journal: System
Published Year: 2014
Pages: 16 Pages
File Type: PDF
Abstract

This paper describes an attempt to establish a pedagogically useful list of the most frequent semantically non-transparent formulaic sequences for non-English majors in an EFL context who need to read the textbooks of their fields in English. The list was compiled from a corpus of 20 million running words drawn from two hundred college textbooks across forty subject areas. To target opaque formulae in widespread use, we applied a set of screening criteria using the program Collocate together with manual checking. Based on frequency, range, meaningfulness, grammatical well-formedness and semantic non-compositionality, a total of 475 opaque formulaic sequences of 2–5 words were selected; together they accounted for approximately 2.08% of the running words in the corpus. The formulae identified were then tested against a frequency threshold in the 120 million words of academic texts in the 450-million-token Corpus of Contemporary American English (COCA) to verify whether they merit pedagogical attention. As with other wordlists, it is hoped that this phrase list may serve as a reference for EAP teaching.

Related Topics
Social Sciences and Humanities > Arts and Humanities > Language and Linguistics