Article code: 567033
Journal code: 1452042
Publication year: 2014
English article: 18 pages, PDF
Full-text version: Free download
English title of the ISI article
SMT-based ASR domain adaptation methods for under-resourced languages: Application to Romanian
Related topics
Engineering and Basic Sciences, Computer Engineering, Signal Processing
English abstract


• The first large-vocabulary Romanian ASR system is presented.
• Phonetization and diacritics restoration systems for Romanian are introduced.
• An innovative ASR domain-adaptation methodology based on SMT is proposed.
• The semi-supervised adaptation methods are shown to improve ASR performance.

This study investigates the possibility of using statistical machine translation (SMT) to create domain-specific language resources. We propose a methodology for building a domain-specific automatic speech recognition (ASR) system for a low-resourced language when in-domain text corpora are available only in a high-resourced language. Several translation scenarios, both unsupervised and semi-supervised, are used to obtain domain-specific textual data. Moreover, this paper shows that a small amount of manually post-edited text is enough to develop other natural language processing systems that, in turn, can be used to automatically improve the machine-translated text, leading to a significant boost in ASR performance. The paper concludes with an in-depth analysis of why and how the machine-translated text improves the performance of the domain-specific ASR. As by-products of this core domain-adaptation methodology, this paper also presents the first large-vocabulary continuous speech recognition system for Romanian, and introduces a diacritics restoration module to process the Romanian text corpora, as well as an automatic phonetization module needed to extend the Romanian pronunciation dictionary.

Publisher
Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 56, January 2014, Pages 195–212