Article ID | Journal | Published Year | Pages |
---|---|---|---|
384631 | Expert Systems with Applications | 2012 | 7 Pages |
This paper presents POST-AL, the first part-of-speech tagger for the Ainu language. The system uses a hand-crafted dictionary based on the Ainu narratives known as “yukar”. For each token it provides three types of information: the word/token itself, its part of speech, and its translation into Japanese. Evaluation on a training set gave positive results. The system could be useful in many tasks related to research on the Ainu language, such as content analysis or translation, which until now have mostly been done manually.
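The three-field output described in the abstract can be illustrated with a minimal dictionary-lookup sketch. This is not the POST-AL implementation: the names `TaggedToken`, `TOY_DICTIONARY`, and `tag` are hypothetical, the tokenizer is a plain whitespace split, and the three dictionary entries are toy examples rather than entries from the system's hand-crafted dictionary.

```python
from typing import NamedTuple


class TaggedToken(NamedTuple):
    token: str        # surface form of the word
    pos: str          # part-of-speech label
    translation: str  # Japanese translation of the token


# Toy entries for illustration only; NOT taken from the POST-AL dictionary.
TOY_DICTIONARY = {
    "kamuy": ("noun", "神"),
    "cise": ("noun", "家"),
    "wakka": ("noun", "水"),
}

# Fallback for tokens missing from the dictionary (an assumption of this sketch).
UNKNOWN = ("unknown", "?")


def tag(text, dictionary=TOY_DICTIONARY):
    """Split on whitespace and look each token up in the dictionary."""
    tagged = []
    for token in text.lower().split():
        pos, translation = dictionary.get(token, UNKNOWN)
        tagged.append(TaggedToken(token, pos, translation))
    return tagged


if __name__ == "__main__":
    for item in tag("kamuy cise"):
        print(item.token, item.pos, item.translation, sep="\t")
```

A pure lookup like this ignores context, so a word with several possible parts of speech would always receive the same tag.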
► We propose POST-AL, the first POS tagger for the critically endangered Ainu language.
► The POS tagger's functions include tokenization, POS tagging, and token translation.
► We evaluate the system on 13 Ainu “yukar” stories, with two versions of each function.
► Contextual POS tagging based on a higher-order HMM outperforms the statistical approach.
► POST-AL incorporates all POS tagging standards and provides a user-friendly interface.
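To give a rough idea of how HMM-based contextual tagging picks a tag sequence, here is a small Viterbi decoder for a first-order HMM. It is a deliberate simplification of the higher-order model the highlights mention, not the authors' method. The function name `viterbi`, the probability floor, the two-tag state set, and every probability in the toy model are assumptions made up for this sketch.

```python
import math


def viterbi(tokens, states, start_p, trans_p, emit_p, floor=1e-6):
    """Return the most likely tag sequence under a first-order HMM."""
    def lp(p):
        # Log-probability with a floor so unseen events do not yield log(0).
        return math.log(max(p, floor))

    if not tokens:
        return []

    # best[i][s] = (log-prob of the best path ending in state s at position i,
    #               previous state on that path)
    best = [{s: (lp(start_p.get(s, 0)) + lp(emit_p[s].get(tokens[0], 0)), None)
             for s in states}]
    for tok in tokens[1:]:
        column = {}
        for s in states:
            prev, score = max(
                ((p, best[-1][p][0] + lp(trans_p[p].get(s, 0))) for p in states),
                key=lambda pair: pair[1],
            )
            column[s] = (score + lp(emit_p[s].get(tok, 0)), prev)
        best.append(column)

    # Backtrack from the best final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return path[::-1]


# Toy model: all numbers are invented for illustration, not estimated from data.
STATES = ["noun", "verb"]
START_P = {"noun": 0.7, "verb": 0.3}
TRANS_P = {"noun": {"noun": 0.4, "verb": 0.6},
           "verb": {"noun": 0.7, "verb": 0.3}}
EMIT_P = {"noun": {"kamuy": 0.5, "cise": 0.5},
          "verb": {"arpa": 1.0}}

if __name__ == "__main__":
    print(viterbi(["kamuy", "arpa"], STATES, START_P, TRANS_P, EMIT_P))
```

A higher-order HMM would condition each transition on more than one preceding tag. The decoding idea stays the same, but the state space grows accordingly.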