Article ID: 373262
Journal: System
Published Year: 2013
Pages: 17
File Type: PDF
Abstract

In assessments of second language (L2) writing, lexical quality typically accounts for more variance than other factors, and the most readily operationalized measure of lexical quality is word frequency. This study compares two methods of automatically assessing word frequency in learner productions. The first, a band-based method, involves lexical frequency profiling, a procedure that first groups individual words into families and then sorts these families into corpus-based frequency bands. The second, a count-based method, assigns a normalized corpus frequency count to each individual word form used, yielding an average count for a text. Both methods were used to analyze 100 L2 learner and 30 native speaker freewrites that had been classified by proficiency level (native speakers and beginning, intermediate, and advanced L2 learners). Machine learning algorithms were then used to classify the texts into their respective proficiency levels. Count-based word frequency indices correctly classified 58% of the texts, while band-based indices yielded accuracies between 10% and 22% lower.
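
To make the contrast between the two methods concrete, the following is a minimal sketch, not the study's actual implementation. The frequency values, word-family mapping, and band assignments are hypothetical toy data invented for illustration, and the function names (`tokenize`, `band_profile`, `mean_frequency`) are assumptions. A real analysis would draw these resources from a reference corpus and published family lists.

```python
import re
from collections import Counter

# Hypothetical toy resources (NOT from the study): a real analysis would load
# corpus-derived frequencies and published word-family lists instead.
FREQ_PER_MILLION = {  # normalized frequency of each individual word form
    "the": 50000.0, "on": 6000.0, "sit": 80.0, "sat": 40.0,
    "cat": 60.0, "cats": 20.0, "mat": 5.0,
}
FAMILY_OF = {"cats": "cat", "sat": "sit"}  # word form -> family headword
BAND_OF_FAMILY = {"the": 1, "on": 1, "sit": 1, "cat": 2, "mat": 3}


def tokenize(text):
    """Lowercase the text and extract alphabetic word forms."""
    return re.findall(r"[a-z']+", text.lower())


def band_profile(tokens):
    """Band-based method: map each token to its family, then to a band,
    and return the proportion of tokens falling in each band."""
    bands = Counter(
        BAND_OF_FAMILY.get(FAMILY_OF.get(tok, tok), "off-list")
        for tok in tokens
    )
    return {band: n / len(tokens) for band, n in bands.items()}


def mean_frequency(tokens):
    """Count-based method: average the normalized frequency of each word
    form. Skipping unknown forms is a simplifying assumption made here."""
    known = [FREQ_PER_MILLION[tok] for tok in tokens if tok in FREQ_PER_MILLION]
    return sum(known) / len(known) if known else 0.0


tokens = tokenize("The cats sat on the mat.")
profile = band_profile(tokens)   # proportions per frequency band
average = mean_frequency(tokens)  # single average count for the text
```

Note the difference in what each method returns: the band-based profile is a small set of proportions in which "cats" and "cat" are treated identically, whereas the count-based index is a single continuous value that distinguishes every word form. Either output could then serve as a feature for a classifier of proficiency level.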
