Article code: 558411
Journal code: 874924
Publication year: 2013
English article: 10 pages, PDF
Full-text version: free download
English title of the ISI article
Batch-mode semi-supervised active learning for statistical machine translation
Related topics
Engineering and Basic Sciences > Computer Engineering > Signal Processing
English abstract

The development of high-performance statistical machine translation (SMT) systems is contingent on the availability of substantial, in-domain parallel training corpora. Such corpora, however, are expensive to produce because manual translation is labor-intensive. We propose to alleviate this problem with a novel, semi-supervised, batch-mode active learning strategy that attempts to maximize in-domain coverage by selecting sentences that balance domain match, translation difficulty, and batch diversity. Simulation experiments on an English-to-Pashto translation task show that the proposed strategy outperforms not only the random selection baseline but also traditional active selection techniques based on dissimilarity to existing training data.
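The abstract names three selection criteria but does not give the scoring formula, so the following is only a minimal sketch of what a greedy batch selector balancing them might look like. It is not the authors' method. The function `select_batch`, the weighted-sum combination, the bigram-novelty measure of batch diversity, and the precomputed `domain_score` and `difficulty_score` inputs are all assumptions made for illustration.

```python
def ngrams(tokens, n=2):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def select_batch(pool, domain_score, difficulty_score, batch_size,
                 weights=(1.0, 1.0, 1.0)):
    """Greedily pick indices of `batch_size` sentences from `pool`.

    Hypothetical scoring (an assumption, not taken from the paper):
    weighted sum of a domain-match score, a translation-difficulty
    score, and the fraction of the sentence's bigrams not yet covered
    by sentences already chosen for the batch (batch diversity).
    """
    w_dom, w_diff, w_div = weights
    chosen = []        # indices selected so far, in selection order
    covered = set()    # bigrams already present in the batch
    remaining = list(range(len(pool)))

    while remaining and len(chosen) < batch_size:
        def score(i):
            grams = set(ngrams(pool[i].split()))
            novelty = len(grams - covered) / len(grams) if grams else 0.0
            return (w_dom * domain_score[i]
                    + w_diff * difficulty_score[i]
                    + w_div * novelty)

        best = max(remaining, key=score)  # ties go to the earliest index
        chosen.append(best)
        covered |= set(ngrams(pool[best].split()))
        remaining.remove(best)

    return chosen


# Toy example: sentences 0 and 1 are identical, so after 0 is chosen
# the diversity term steers the second pick toward sentence 2.
pool = ["a b c", "a b c", "d e f"]
batch = select_batch(pool,
                     domain_score=[0.9, 0.9, 0.5],
                     difficulty_score=[0.5, 0.5, 0.5],
                     batch_size=2)
print(batch)
```

In this sketch the diversity term is recomputed after every pick, which is what makes the selection batch-aware: a sentence that duplicates material already in the batch loses its novelty score even if its domain match is high.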

Publisher
Database: Elsevier - ScienceDirect
Journal: Computer Speech & Language - Volume 27, Issue 2, February 2013, Pages 397–406