Article code: 1110949
Journal code: 1488361
Publication year: 2015
English article: 12-page PDF
Full-text version: Free download
English title of the ISI article
Building Corpus-based Frequency Lemma Lists
Persian translation of the title
لومن فرکانس مبتنی بر فضایی را لیست می کند؟
Related topics
Social Sciences and Humanities; Arts and Humanities; Arts and Humanities (General)
English abstract

This paper presents a simple methodology to create corpus-based frequency lemma lists, applied to the case of the Basque language. Since the first work on the matter in 1982, the amount of text written in Basque and the number of language resources for it have grown exponentially. Based on state-of-the-art Basque corpora and current NLP technology, we develop a frequency lemma list for standard Basque. Our aim is twofold: on the one hand, to propose a primary Basque lemma list for a bilingual dictionary that is currently being worked on at UPV/EHU and, on the other, to contrast existing Basque dictionary lemma lists with frequency data, in order to evaluate the adequacy of our proposal and to compare the lemma lists with each other.

Publisher
Database: Elsevier - ScienceDirect
Journal: Procedia - Social and Behavioral Sciences - Volume 198, 24 July 2015, Pages 266-277