Integrating imperfect transcripts into speech recognition systems for building high-quality corpora

Article code	Journal code	Publication year	English article	Full-text version
557911	874813	2012	23-page PDF	Free download


Keywords

Speech processing

Related topics

Engineering and Basic Sciences, Computer Engineering, Signal Processing


Abstract

Training state-of-the-art automatic speech recognition (ASR) systems requires very large, relevant training corpora. The cost of such databases is high and remains a major limitation for the development of speech-enabled applications in particular contexts (e.g. low-density languages or specialized domains). On the other hand, a large amount of data can be found in news prompts, movie subtitles, scripts, etc. Using such data as a training corpus could provide a low-cost solution to the acoustic model estimation problem. Unfortunately, prior transcripts are seldom exact with respect to the content of the speech signal, and they lack temporal information. This paper tackles the improvement of prompt-based speech corpora by addressing both problems. We propose a method that locates accurate transcript segments in speech signals and automatically corrects errors or missing text in the transcript surrounding these segments. This method relies on a new decoding strategy in which the search algorithm is driven by the imperfect transcription of the input utterances. Experiments are conducted on French, using the ESTER database and a set of recordings (and associated prompts) from RTBF (Radio Télévision Belge Francophone). The results demonstrate the effectiveness of the proposed approach in terms of both error correction and text-to-speech alignment.

► Training of automatic speech recognition systems.
► On-the-fly synchronization of imperfect transcripts to guide the ASR output.
► Use of the generated corpus as source and target of the system adaptation process.
► Local version of the slightly supervised algorithm.
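
The idea of locating accurate transcript segments can be illustrated with a small sketch. This is not the paper's decoding strategy, which drives the search algorithm itself with the imperfect transcript; it is a simpler post-hoc comparison between a prompt and an already finished ASR hypothesis, using Python's standard `difflib`. The function name `find_anchor_segments`, the `min_length` threshold, and the example sentences are assumptions made for illustration only.

```python
from difflib import SequenceMatcher


def find_anchor_segments(prompt_words, hypothesis_words, min_length=3):
    """Return word runs shared by the prompt and the ASR hypothesis.

    Each result is (prompt_start, hypothesis_start, length). Runs shorter
    than `min_length` words are dropped as too short to trust as anchors.
    """
    # autojunk=False disables difflib's popularity heuristic, which could
    # otherwise ignore very frequent words in long transcripts.
    matcher = SequenceMatcher(a=prompt_words, b=hypothesis_words, autojunk=False)
    return [
        (block.a, block.b, block.size)
        for block in matcher.get_matching_blocks()
        if block.size >= min_length
    ]


# Hypothetical data: an imperfect prompt and a hypothetical ASR output.
prompt = "the president said that the new budget will be voted on next week".split()
hypothesis = "uh the president said the new budget will be voted next week".split()

anchors = find_anchor_segments(prompt, hypothesis, min_length=3)
for prompt_start, hypothesis_start, size in anchors:
    print(prompt_start, hypothesis_start, " ".join(prompt[prompt_start:prompt_start + size]))
```

In this sketch, the regions between two anchors are where the prompt and the recognizer disagree, so they are the candidates for correction. A real system would also need word timings from the recognizer to turn anchors into a text-to-speech alignment.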

Publisher

Database: Elsevier - ScienceDirect
Journal: Computer Speech & Language - Volume 26, Issue 2, April 2012, Pages 67–89

Authors

Benjamin Lecouteux, Georges Linarès, Stanislas Oger
