Article ID: 10369495
Journal: Computer Speech & Language
Published Year: 2005
Pages: 17
File Type: PDF
Abstract
In this paper, we describe the empirical evaluation of statistical association measures for the extraction of lexical collocations from text corpora. We argue that the results of an evaluation experiment cannot easily be generalized to a different setting. Consequently, such experiments have to be carried out under conditions that are as similar as possible to the intended use of the measures. Finally, we show how an evaluation strategy based on random samples can reduce the amount of manual annotation work significantly, making it possible to perform many more evaluation experiments under specific conditions.
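As a rough illustration of the sample-based idea, the sketch below estimates the proportion of true collocations in a candidate list by annotating only a random sample of it, and attaches a Wilson score interval to the estimate. This is one plausible reading of the strategy, not the authors' procedure: the function name `estimate_precision`, the `is_true_positive` callback (standing in for a human annotator), and the choice of the Wilson interval are all assumptions made for the example.

```python
import math
import random


def estimate_precision(candidates, is_true_positive, sample_size, seed=0, z=1.96):
    """Estimate the share of true collocations in `candidates` from a random sample.

    `is_true_positive` is a hypothetical stand-in for manual annotation;
    only the sampled items are passed to it.
    Returns (estimate, lower, upper), where the bounds form a Wilson score
    interval at the confidence level implied by `z` (1.96 is roughly 95%).
    """
    if not candidates:
        raise ValueError("candidate list is empty")
    rng = random.Random(seed)
    k = min(sample_size, len(candidates))
    sample = rng.sample(candidates, k)  # sampling without replacement

    hits = sum(1 for c in sample if is_true_positive(c))
    p = hits / k

    # Wilson score interval for a binomial proportion.
    denom = 1 + z * z / k
    centre = (p + z * z / (2 * k)) / denom
    half = z * math.sqrt(p * (1 - p) / k + z * z / (4 * k * k)) / denom
    return p, max(0.0, centre - half), min(1.0, centre + half)


if __name__ == "__main__":
    # Synthetic stand-in data: every fourth candidate counts as a true collocation.
    ranked = list(range(1000))
    p, low, high = estimate_precision(ranked, lambda c: c % 4 == 0, sample_size=200)
    print(f"estimated precision {p:.3f}, interval [{low:.3f}, {high:.3f}]")
```

Here 200 annotation decisions stand in for 1000. The width of the interval indicates how much a precision difference between two association measures could be explained by sampling variation alone.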
Related Topics
Physical Sciences and Engineering, Computer Science, Signal Processing