Article ID | Journal | Published Year | Pages |
---|---|---|---|
1110943 | Procedia - Social and Behavioral Sciences | 2015 | 8 Pages |
Abstract
Corpus-based dialectology of less-resourced, functionally limited native languages is a developing field of linguistics. In this paper we discuss the challenges of annotating dialect corpora for the Turkic languages of Russia, using the Mishar dialect of the Tatar language as an example. We investigate the peculiarities of grammatical variability in the Mishar dialect from the point of view of automatic annotation and describe the search functionality of the corpus. The proposed annotation methodology can be used to create multilingual integrated resources and parallel corpora of closely related languages.