| Article ID | Journal | Published Year | Pages |
|---|---|---|---|
| 1110949 | Procedia - Social and Behavioral Sciences | 2015 | 12 Pages |
Abstract
This paper presents a simple methodology for creating corpus-based frequency lemma lists, applied to the Basque language. Since the first work on the matter in 1982, the amount of text written in Basque and the number of language resources for it have grown exponentially. Based on state-of-the-art Basque corpora and current NLP technology, we develop a frequency lemma list for standard Basque. Our aim is twofold: on the one hand, to propose a primary Basque lemma list for a bilingual dictionary currently being developed at UPV/EHU; on the other, to contrast existing Basque dictionary lemma lists with frequency data, in order to evaluate the adequacy of our proposal and to compare the lemma lists with each other.
Related Topics
												
- Social Sciences and Humanities
- Arts and Humanities
- Arts and Humanities (General)
												
											