Article code: 563164
Journal code: 875473
Publication year: 2012
English article: 19-page PDF
Full text: free download
English title of the ISI article
Automatic categorization for improving Spanish into Spanish Sign Language machine translation
Related topics
Engineering and Basic Sciences, Computer Engineering, Signal Processing
Abstract

This paper describes a preprocessing module for improving the performance of a Spanish into Spanish Sign Language (Lengua de Signos Española: LSE) translation system when dealing with sparse training data. This preprocessing module replaces Spanish words with associated tags. The list with Spanish words (vocabulary) and associated tags used by this module is computed automatically considering those signs that show the highest probability of being the translation of every Spanish word. This automatic tag extraction has been compared to a manual strategy achieving almost the same improvement. In this analysis, several alternatives for dealing with non-relevant words have been studied. Non-relevant words are Spanish words not assigned to any sign. The preprocessing module has been incorporated into two well-known statistical translation architectures: a phrase-based system and a Statistical Finite State Transducer (SFST). This system has been developed for a specific application domain: the renewal of Identity Documents and Driver's License. In order to evaluate the system a parallel corpus made up of 4080 Spanish sentences and their LSE translation has been used. The evaluation results revealed a significant performance improvement when including this preprocessing module. In the phrase-based system, the proposed module has given rise to an increase in BLEU (Bilingual Evaluation Understudy) from 73.8% to 81.0% and an increase in the human evaluation score from 0.64 to 0.83. In the case of SFST, BLEU increased from 70.6% to 78.4% and the human evaluation score from 0.65 to 0.82.


► A preprocessing module using an automatic categorization (of the source language) for improving the performance of a Spanish into Spanish Sign Language (LSE: Lengua de Signos Española) translation system.
► The proposed automatic categorization consists of tagging (and replacing) every Spanish word with the sign that shows the highest probability of being the translation of this word.
► This automatic categorization has been compared to a manual categorization achieving almost the same improvement.
► The proposed approach has been incorporated into two well-known statistical translation architectures: a phrase-based system (Moses) and a Statistical Finite State Transducer (SFST).
► The preprocessing module achieves an important translation error reduction: between 30% and 40% relative error reduction depending on the experiment.
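The categorization step described above can be illustrated with a minimal sketch. It assumes a word-to-sign probability table is already available (for example, from a word alignment step); the table contents, the `min_prob` cutoff, the `NON_RELEVANT` placeholder tag, and all function names are hypothetical and are not taken from the paper. Each Spanish word is replaced by its most probable sign, and words left without a sign are treated as non-relevant and either replaced by a placeholder or dropped, which are two plausible alternatives among those the abstract says were studied.

```python
from typing import Dict, List, Optional

# Hypothetical p(sign | word) values, invented purely for illustration.
LEX_PROBS: Dict[str, Dict[str, float]] = {
    "quiero": {"QUERER": 0.92, "YO": 0.08},
    "renovar": {"RENOVAR": 0.88, "NUEVO": 0.12},
    "el": {"CARNET": 0.10, "DNI": 0.05},
    "carnet": {"CARNET": 0.95, "CONDUCIR": 0.05},
}


def build_tag_table(
    lex_probs: Dict[str, Dict[str, float]], min_prob: float = 0.5
) -> Dict[str, str]:
    """Map each word to its most probable sign.

    Assumption: a word whose best sign falls below `min_prob` is left
    unassigned, i.e. treated as a non-relevant word.
    """
    table: Dict[str, str] = {}
    for word, sign_probs in lex_probs.items():
        if not sign_probs:
            continue
        sign, prob = max(sign_probs.items(), key=lambda item: item[1])
        if prob >= min_prob:
            table[word] = sign
    return table


def categorize(
    sentence: str,
    table: Dict[str, str],
    non_relevant_tag: Optional[str] = "NON_RELEVANT",
) -> List[str]:
    """Replace every word with its tag before translation.

    Words without a tag are replaced by `non_relevant_tag`, or dropped
    entirely when `non_relevant_tag` is None.
    """
    tags: List[str] = []
    for word in sentence.lower().split():
        tag = table.get(word)
        if tag is not None:
            tags.append(tag)
        elif non_relevant_tag is not None:
            tags.append(non_relevant_tag)
    return tags


TAG_TABLE = build_tag_table(LEX_PROBS)
print(categorize("Quiero renovar el carnet", TAG_TABLE))
print(categorize("Quiero renovar el carnet", TAG_TABLE, non_relevant_tag=None))
```

In this sketch the tagged sentence, rather than the raw Spanish words, would be passed to the translation system, so that words sharing a sign collapse onto one token and fewer distinct source tokens need to be learned from sparse training data.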

Publisher
Database: Elsevier - ScienceDirect
Journal: Computer Speech & Language - Volume 26, Issue 3, June 2012, Pages 149–167