Article ID: 384631
Journal: Expert Systems with Applications
Published Year: 2012
Pages: 7 Pages
File Type: PDF
Abstract

This paper presents POST-AL, the first part-of-speech tagger for the Ainu language. The system uses a hand-crafted dictionary based on the Ainu narratives known as “yukar”. For each token it provides three types of information: the word/token itself, its part of speech, and its translation into Japanese. Evaluation on a training set gave positive results. The system could support many tasks in research on the Ainu language, such as content analysis or translation, which until now have been done mostly by hand.

► We propose POST-AL, the first POS tagger for the critically endangered Ainu language.
► The POS tagger's functions include tokenization, POS tagging, and token translation.
► We evaluate the system on 13 Ainu “yukar” stories, with two versions of each function.
► Contextual POS tagging based on a higher-order HMM outperforms the statistical approach.
► POST-AL incorporates all POS tagging standards and provides a user-friendly interface.
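
The three outputs named in the abstract (token, part of speech, Japanese translation) can be pictured with a minimal dictionary-lookup sketch. This is not POST-AL's implementation: the `Entry` type, the `tag` function, the whitespace tokenizer, and the three toy dictionary entries are assumptions made purely for illustration, and the contextual HMM-based tagging mentioned in the highlights is not modelled here.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    """One hypothetical dictionary record: part of speech plus Japanese gloss."""
    pos: str
    translation: str


# Toy dictionary for illustration only; these entries are NOT taken from
# POST-AL's hand-crafted dictionary.
TOY_DICTIONARY = {
    "kamuy": Entry("noun", "神"),
    "cep": Entry("noun", "魚"),
    "wakka": Entry("noun", "水"),
}

# Fallback for tokens that are missing from the dictionary.
UNKNOWN = Entry("unknown", "?")


def tag(text, dictionary=TOY_DICTIONARY):
    """Split text on whitespace and return (token, POS, translation) triples.

    Lookup is case-insensitive; the original token spelling is preserved.
    """
    return [(tok, *dictionary.get(tok.lower(), UNKNOWN)) for tok in text.split()]


if __name__ == "__main__":
    for triple in tag("kamuy cep wakka"):
        print(triple)
```

A plain lookup like this assigns each token a single fixed tag, which is why a context-aware model is needed once a word form can take more than one part of speech.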

Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence