کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
563517 875501 2007 23 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Improving statistical machine translation using shallow linguistic knowledge
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر پردازش سیگنال
پیش نمایش صفحه اول مقاله
Improving statistical machine translation using shallow linguistic knowledge
چکیده انگلیسی
We describe methods for improving the performance of statistical machine translation (SMT) between four linguistically different languages, i.e., Chinese, English, Japanese, and Korean by using morphosyntactic knowledge. For the purpose of reducing the translation ambiguities and generating grammatically correct and fluent translation output, we address the use of shallow linguistic knowledge, that is: (1) enriching a word with its morphosyntactic features, (2) obtaining shallow linguistically-motivated phrase pairs, (3) iteratively refining word alignment using filtered phrase pairs, and (4) building a language model from morphosyntactically enriched words. Previous studies reported that the introduction of syntactic features into SMT models resulted in only a slight improvement in performance in spite of the heavy computational expense, however, this study demonstrates the effectiveness of morphosyntactic features, when reliable, discriminative features are used. Our experimental results show that word representations that incorporate morphosyntactic features significantly improve the performance of the translation model and language model. Moreover, we show that refining the word alignment using fine-grained phrase pairs is effective in improving system performance.
ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Computer Speech & Language - Volume 21, Issue 2, April 2007, Pages 350-372
نویسندگان
, , ,