Article ID: 490387
Journal: Procedia Computer Science
Published Year: 2013
Pages: 9 Pages
File Type: PDF
Abstract

Part-of-speech (POS) tagging is a fundamental task in natural language processing (NLP). It provides useful information for many other NLP tasks, including word sense disambiguation, text chunking, named entity recognition, syntactic parsing, semantic role labeling, and semantic parsing. In this paper, we present a new method for Vietnamese POS tagging using dual decomposition. We show how dual decomposition can be used to integrate a word-based model and a syllable-based model to yield a more powerful model for tagging Vietnamese sentences. We also describe experiments on the Viet Treebank corpus, a large annotated corpus for Vietnamese POS tagging. Experimental results show that our model using dual decomposition outperforms both word-based and syllable-based models.
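The abstract does not spell out the algorithm, so the following is only a generic sketch of how dual decomposition can make two sequence taggers agree, not the authors' implementation. It assumes, for simplicity, that both models score tags over the same token positions (the paper's word-based and syllable-based models operate on different units, which would need an extra alignment step). All names here (`viterbi`, `dual_decompose`, the score matrices, `iters`, `step`) are hypothetical. Each model is decoded separately with Viterbi, and Lagrange multipliers `u` are adjusted by subgradient steps until the two outputs coincide.

```python
from typing import List, Tuple

Scores = List[List[float]]  # scores[i][t]: score of tag t at position i


def viterbi(emit: Scores, trans: Scores) -> List[int]:
    """Return the highest-scoring tag sequence under emission + transition scores."""
    n, k = len(emit), len(emit[0])
    best = [emit[0][:]]  # best[i][t]: best score of a path ending in tag t at i
    back = []            # back-pointers for positions 1..n-1
    for i in range(1, n):
        row, ptr = [], []
        for t in range(k):
            prev = max(range(k), key=lambda p: best[-1][p] + trans[p][t])
            row.append(best[-1][prev] + trans[prev][t] + emit[i][t])
            ptr.append(prev)
        best.append(row)
        back.append(ptr)
    tag = max(range(k), key=lambda t: best[-1][t])
    path = [tag]
    for ptr in reversed(back):
        tag = ptr[tag]
        path.append(tag)
    return path[::-1]


def dual_decompose(emit_a: Scores, trans_a: Scores,
                   emit_b: Scores, trans_b: Scores,
                   iters: int = 50, step: float = 1.0) -> Tuple[List[int], bool]:
    """Decode two models separately and push them toward agreement.

    Returns (tags, converged). If converged is True, both models produced
    the same sequence, which is then optimal for the sum of their scores.
    """
    n, k = len(emit_a), len(emit_a[0])
    u = [[0.0] * k for _ in range(n)]  # Lagrange multipliers, one per (position, tag)
    tags_a: List[int] = []
    for it in range(1, iters + 1):
        # Model A sees its scores plus u; model B sees its scores minus u.
        tags_a = viterbi([[emit_a[i][t] + u[i][t] for t in range(k)]
                          for i in range(n)], trans_a)
        tags_b = viterbi([[emit_b[i][t] - u[i][t] for t in range(k)]
                          for i in range(n)], trans_b)
        if tags_a == tags_b:
            return tags_a, True
        rate = step / it  # decaying step size (an assumed schedule)
        for i in range(n):
            # Subgradient step: penalize A's choice, reward B's choice.
            # Where the two agree, the updates cancel.
            u[i][tags_a[i]] -= rate
            u[i][tags_b[i]] += rate
    return tags_a, False  # no agreement within the iteration budget
```

When the loop exits with agreement, the shared sequence is an exact maximizer of the combined score; when it does not, the returned tags are only model A's last guess, and a real system would need a fallback strategy.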
