Article code: 558482
Journal code: 874939
Publication year: 2011
English article: 28-page PDF
Full-text version: free download
English title of the ISI article
Syntax-based reordering for statistical machine translation
Related topics
Engineering and Basic Sciences; Computer Engineering; Signal Processing
English abstract

In this paper, we develop an approach called syntax-based reordering (SBR) to handle the fundamental problem of word ordering in statistical machine translation (SMT). We propose to alleviate the word order challenge by including morpho-syntactical and statistical information in a pre-translation reordering framework aimed at capturing short- and long-distance word distortion dependencies. We examine the proposed approach from theoretical and experimental points of view, discussing and analyzing its advantages and limitations in comparison with some state-of-the-art reordering methods. In the final part of the paper, we describe the results of applying the syntax-based model to translation tasks with a great need for reordering (Chinese-to-English and Arabic-to-English). The experiments are carried out on standard phrase-based and alternative N-gram-based SMT systems. We first investigate sparse training data scenarios, in which the translation and reordering models are trained on sparse bilingual data, and then scale the method to a large training set, demonstrating that the improvement in translation quality is maintained.

Research highlights
► A syntax-driven approach to handling the problem of word ordering for statistical machine translation.
► The word order challenge is alleviated by including morpho-syntactical and statistical information in a pre-translation reordering framework.
► The results are presented for small and large Chinese-to-English and Arabic-to-English translation tasks.
► The experiments are carried out on phrase-based and N-gram-based statistical machine translation systems.

Publisher
Database: Elsevier - ScienceDirect
Journal: Computer Speech & Language - Volume 25, Issue 4, October 2011, Pages 761–788
Authors
, ,