Unsupervised training and directed manual transcription for LVCSR

Article code	Journal code	Publication year	English article	Full-text version
567617	1452046	2010	12-page PDF	Free download

Keywords

Automatic transcription; Discriminative training; Unsupervised training; Data selection

Related topics

Engineering and Basic Sciences; Computer Engineering; Signal Processing

Abstract

A significant cost in obtaining acoustic training data is the generation of accurate transcriptions. When no transcription is available, unsupervised training techniques must be used. Furthermore, discriminative training has become a standard feature of state-of-the-art large vocabulary continuous speech recognition (LVCSR) systems. In unsupervised training, unlabelled data are recognised using a seed model, and the hypotheses from the recognition system are used as transcriptions for training. Compared with maximum likelihood training, discriminative training is more sensitive to the quality of the transcriptions. One approach to this issue is data selection, where only well-recognised data are selected for training. A more effective alternative, and the key contribution of this work, is an active learning technique called directed manual transcription: a relatively small amount of poorly recognised data is manually transcribed to supplement the automatic transcriptions. Experiments show that data selection for discriminative training yields disappointing improvements on data that are mismatched to the type of data used to train the seed model. In contrast, directed manual transcription can yield significant improvements in recognition accuracy on all types of data.
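The two strategies contrasted in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Utterance` record, the utterance-level `confidence` score, the fixed threshold, and the transcription budget in seconds are all assumptions made for illustration, since the abstract does not say how "well recognised" or "poorly recognised" data are identified.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Utterance:
    utt_id: str
    hypothesis: str    # 1-best output of the seed model (used as the automatic transcription)
    confidence: float  # hypothetical utterance-level confidence in [0, 1]
    duration: float    # seconds of audio


def select_well_recognised(utts: List[Utterance], threshold: float) -> List[Utterance]:
    """Data selection: keep only utterances whose confidence meets the threshold."""
    return [u for u in utts if u.confidence >= threshold]


def direct_manual_transcription(
    utts: List[Utterance], budget_seconds: float
) -> Tuple[List[Utterance], List[Utterance]]:
    """Directed manual transcription (sketch).

    Walk through the utterances from lowest to highest confidence and mark
    each one for manual transcription if it still fits in the budget.
    Everything else keeps its automatic hypothesis.
    """
    manual: List[Utterance] = []
    automatic: List[Utterance] = []
    used = 0.0
    for u in sorted(utts, key=lambda x: x.confidence):
        if used + u.duration <= budget_seconds:
            manual.append(u)
            used += u.duration
        else:
            automatic.append(u)
    return manual, automatic


# Toy data, invented for this example only.
utts = [
    Utterance("u1", "hello world", 0.95, 4.0),
    Utterance("u2", "recognise speech", 0.40, 3.0),
    Utterance("u3", "wreck a nice beach", 0.20, 5.0),
    Utterance("u4", "good morning", 0.80, 2.0),
]

selected = select_well_recognised(utts, threshold=0.7)
manual, automatic = direct_manual_transcription(utts, budget_seconds=8.0)

print([u.utt_id for u in selected])   # utterances kept by data selection
print([u.utt_id for u in manual])     # utterances sent to human transcribers
print([u.utt_id for u in automatic])  # utterances trained on with seed-model hypotheses
```

The design difference is that data selection discards the poorly recognised utterances, whereas directed manual transcription keeps all of the audio and spends a small human transcription budget where the seed model is least reliable.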

Publisher

Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 52, Issues 7–8, July–August 2010, Pages 652–663

Authors

Kai Yu, Mark Gales, Lan Wang, Philip C. Woodland
