Article ID Journal Published Year Pages File Type
493054 Procedia Technology 2013 7 Pages PDF
Abstract

This paper presents a part-of-speech (POS) tagger for the Kadazan language using a transformation-based approach. The objectives of this study are to develop a POS tagger for Kadazan, a language for which no tagger has previously been developed systematically with any tagging approach; to solve the disambiguation problem in that language; and to use the tagger as a language-learning tool. We adopt this approach because it can achieve high accuracy, comparable to that of other tagging approaches such as statistical and original rule-based techniques, while offering significant advantages over them. The tagging system was trained on two Kadazan corpora containing 741 words and 1,328 words. In the evaluation, the tagging system achieved approximately 92% to 93% accuracy.
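
To make the transformation-based idea concrete, the following is a minimal sketch of the general Brill-style procedure, not the authors' implementation. It assigns each word its most frequent tag, then greedily learns rules of the form "change tag A to tag B when the previous tag is C" that reduce errors on the training data. The toy corpus is English and purely hypothetical (no Kadazan data is reproduced here), and all function names, the single rule template, and the tag labels are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_lexicon(train):
    """Map each word to its most frequent tag in the training data."""
    counts = defaultdict(Counter)
    for sentence in train:
        for word, tag in sentence:
            counts[word][tag] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def apply_rule(tags, rule):
    """Apply one rule (from_tag, to_tag, prev_tag) to a list of tags.

    Conditions are checked against the tags before this rule was applied.
    """
    from_tag, to_tag, prev_tag = rule
    out = list(tags)
    for i in range(1, len(tags)):
        if tags[i] == from_tag and tags[i - 1] == prev_tag:
            out[i] = to_tag
    return out


def count_errors(predicted, gold):
    """Count positions where predicted tags differ from gold tags."""
    return sum(p != g
               for ps, gs in zip(predicted, gold)
               for p, g in zip(ps, gs))


def learn_rules(train, max_rules=5):
    """Greedily pick the rule with the largest error reduction, repeat."""
    lexicon = build_lexicon(train)
    gold = [[tag for _, tag in s] for s in train]
    current = [[lexicon[word] for word, _ in s] for s in train]
    tagset = sorted({tag for s in gold for tag in s})
    rules = []
    for _ in range(max_rules):
        base = count_errors(current, gold)
        best_rule, best_gain = None, 0
        for from_tag in tagset:
            for to_tag in tagset:
                if from_tag == to_tag:
                    continue
                for prev_tag in tagset:
                    rule = (from_tag, to_tag, prev_tag)
                    candidate = [apply_rule(s, rule) for s in current]
                    gain = base - count_errors(candidate, gold)
                    if gain > best_gain:
                        best_rule, best_gain = rule, gain
        if best_rule is None:  # no rule improves the tagging any further
            break
        rules.append(best_rule)
        current = [apply_rule(s, best_rule) for s in current]
    return lexicon, rules


def tag(words, lexicon, rules, default="NOUN"):
    """Tag a sentence: lexicon lookup first, then learned rules in order."""
    tags = [lexicon.get(w, default) for w in words]
    for rule in rules:
        tags = apply_rule(tags, rule)
    return list(zip(words, tags))


# Hypothetical toy corpus: "run" is ambiguous between NOUN and VERB.
TRAIN = [
    [("the", "DET"), ("run", "NOUN")],
    [("they", "PRON"), ("run", "VERB")],
    [("a", "DET"), ("run", "NOUN")],
    [("we", "PRON"), ("run", "VERB")],
    [("I", "PRON"), ("run", "VERB")],
]

lexicon, rules = learn_rules(TRAIN)
print(rules)
print(tag(["the", "run"], lexicon, rules))
```

In this toy setting the baseline tags every "run" as a verb, and the single learned rule corrects it to a noun after a determiner. A real tagger would use many more rule templates (for example, conditions on neighbouring words or on tags two positions away) and would be evaluated on held-out text rather than on its own training data.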
