Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
384631 | 660852 | 2012 | 7-page PDF | Free download |

This paper presents POST-AL, the first part-of-speech tagger for the Ainu language. The system uses a hand-crafted dictionary based on the Ainu narratives known as “yukar”. For each token, the system provides three types of information: the word/token itself, its part of speech, and its translation into Japanese. Evaluation on a training set gave positive results. The system could be useful in many tasks related to research on the Ainu language, such as content analysis or translation, which until now have been done mostly manually.
► We propose POST-AL, the first POS tagger for the critically endangered Ainu language.
► The POS tagger's functions include tokenization, POS tagging, and token translation.
► We evaluate the system on 13 Ainu “yukar” stories with two versions of each function.
► Contextual POS tagging based on a higher-order HMM outperforms the statistical approach.
► POST-AL incorporates all POS tagging standards and provides a user-friendly interface.
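
The three-part output described in the abstract (token, part of speech, Japanese translation) can be illustrated with a minimal dictionary-lookup sketch. This is not POST-AL's implementation: the names `TaggedToken`, `TOY_DICTIONARY`, and `tag`, the whitespace tokenizer, and the two toy entries are all assumptions made for illustration and are not taken from the paper's dictionary. The contextual, HMM-based tagging mentioned in the highlights is not modeled here.

```python
from typing import Dict, List, NamedTuple, Tuple


class TaggedToken(NamedTuple):
    """One output row: the token, its part of speech, and a Japanese gloss."""
    token: str
    pos: str
    translation: str


# Hypothetical toy dictionary: token -> (part of speech, Japanese translation).
# Illustrative entries only; they are NOT drawn from the POST-AL dictionary.
TOY_DICTIONARY: Dict[str, Tuple[str, str]] = {
    "kamuy": ("noun", "神"),
    "wakka": ("noun", "水"),
}

# Fallback used when a token is missing from the dictionary (an assumption).
UNKNOWN: Tuple[str, str] = ("unknown", "?")


def tag(text: str, dictionary: Dict[str, Tuple[str, str]] = TOY_DICTIONARY) -> List[TaggedToken]:
    """Split on whitespace and look each token up in the dictionary."""
    tagged = []
    for token in text.lower().split():
        pos, translation = dictionary.get(token, UNKNOWN)
        tagged.append(TaggedToken(token, pos, translation))
    return tagged


if __name__ == "__main__":
    for row in tag("kamuy wakka"):
        print(f"{row.token}\t{row.pos}\t{row.translation}")
```

A plain lookup like this assigns each token a single tag regardless of context, which is the limitation that a contextual tagger is meant to address.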
Journal: Expert Systems with Applications - Volume 39, Issue 14, 15 October 2012, Pages 11576–11582