Article code: 378800
Journal code: 659219
Publication year: 2015
English article: 11-page PDF
Full-text version: free download
English title of the ISI article
Towards accurate predictors of word quality for Machine Translation: Lessons learned on French–English and English–Spanish systems
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract

This paper proposes some ideas to build effective estimators, which predict the quality of words in a Machine Translation (MT) output. We propose a number of novel features of various types (system-based, lexical, syntactic and semantic) and then integrate them into the conventional (previously used) feature set, for our baseline classifier training. The classifiers are built over two different bilingual corpora: French–English (fr–en) and English–Spanish (en–es). After the experiments with all features, we deploy a “Feature Selection” strategy to filter the best performing ones. Then, a method that combines multiple “weak” classifiers to constitute a strong “composite” classifier by taking advantage of their complementarity allows us to achieve a significant improvement in terms of F-score, for both fr–en and en–es systems. Finally, we exploit word confidence scores for improving the quality estimation system at sentence level.

Publisher
Database: Elsevier - ScienceDirect
Journal: Data & Knowledge Engineering - Volumes 96–97, March–May 2015, Pages 32–42
Authors
, , ,