| Article ID | Journal ID | Publication year | English article | Full text |
|---|---|---|---|---|
| 5124564 | 1488143 | 2017 | 14-page PDF | Free download |
- The semantic complexity and distribution of English quantifiers are studied.
- An automata-based complexity measure for generalized quantifiers is proposed.
- Distributions are inferred from a large Wikipedia-derived corpus (WaCky corpus).
- Short (unigram) and low-complexity (Aristotelian) quantifiers are more frequent.
- Semantic complexity explains 27.29% of frequency deviance.
In this paper we study whether semantic complexity can influence the distribution of generalized quantifiers in a large English corpus derived from Wikipedia. We take the minimal computational device that recognizes a generalized quantifier as the core measure of its semantic complexity. We consider quantifiers that belong to three increasingly complex classes: Aristotelian (recognizable by 2-state acyclic finite automata), counting (k+2-state finite automata), and proportional quantifiers (pushdown automata). Using regression analysis, we show that semantic complexity is a statistically significant factor, explaining 27.29% of frequency variation. We compare this impact to that of other known sources of complexity, both semantic (quantifier monotonicity and the comparative/superlative distinction) and superficial (e.g., the length of quantifier surface forms). In general, we observe that the more complex a quantifier is, the less frequent it is.
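The three complexity classes can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the common string encoding in which a sentence of the form "Q A are B" is represented by one symbol per element of A: `1` if the element is also in B, `0` otherwise. All function names (`encode`, `run_dfa`, `all_quantifier`, `more_than`, `most`) are hypothetical, and the choice of "more than k" as the counting example is an assumption made because it needs exactly k+2 states under this encoding.

```python
def encode(a, b):
    """Encode a model as a binary string: one symbol per element of A,
    '1' if the element is also in B, '0' otherwise (assumed encoding)."""
    return "".join("1" if x in b else "0" for x in sorted(a))


def run_dfa(transitions, start, accepting, word):
    """Run a deterministic finite automaton given as a transition dict."""
    state = start
    for symbol in word:
        state = transitions[(state, symbol)]
    return state in accepting


def all_quantifier(word):
    """Aristotelian 'all': two states, an accepting one and a rejecting sink."""
    transitions = {
        ("ok", "1"): "ok",
        ("ok", "0"): "fail",    # one counterexample is enough to reject
        ("fail", "0"): "fail",
        ("fail", "1"): "fail",
    }
    return run_dfa(transitions, "ok", {"ok"}, word)


def more_than(k, word):
    """Counting 'more than k': states 0..k+1, i.e. k+2 states in total.
    The state records how many 1s were seen, capped at k+1."""
    transitions = {}
    for state in range(k + 2):
        transitions[(state, "0")] = state
        transitions[(state, "1")] = min(state + 1, k + 1)
    return run_dfa(transitions, 0, {k + 1}, word)


def most(word):
    """Proportional 'most': needs a stack, since the surplus of 1s over 0s
    is unbounded. The stack holds the currently unmatched symbols."""
    stack = []
    for symbol in word:
        if stack and stack[-1] != symbol:
            stack.pop()           # cancel a 1 against a 0 (or vice versa)
        else:
            stack.append(symbol)  # empty stack or same symbol on top
    return bool(stack) and stack[-1] == "1"
```

For example, `encode({1, 2, 3}, {2, 3, 4})` yields the string `011`, on which `most` accepts and `all_quantifier` rejects. The finite automata need only a fixed number of states regardless of input length, whereas `most` relies on a stack that can grow with the input, which is the sense in which proportional quantifiers fall into a more complex class.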
Journal: Language Sciences - Volume 60, March 2017, Pages 80-93
