Article code: 568465; Journal code: 1452017; Publication year: 2016; English article: 10-page PDF
English title of the ISI article
Semi-supervised and unsupervised discriminative language model training for automatic speech recognition
Keywords
Discriminative language modeling; Semi-supervised training; Unsupervised training
Related topics
Engineering and Basic Sciences; Computer Engineering; Signal Processing
English abstract


• We investigate supervised, semi-supervised and unsupervised training of discriminative language models (DLMs).
• We use supervised and unsupervised confusion models to generate artificial data.
• We propose three target output selection methods for unsupervised DLM training.
• Ranking perceptron performs better than structured perceptron in most cases.
• Significant gains in ASR accuracy are obtained with unmatched acoustic and text data.

Discriminative language modeling aims to reduce error rates by rescoring the output of an automatic speech recognition (ASR) system. Discriminative language model (DLM) training conventionally follows a supervised approach, using acoustic recordings together with their manual transcriptions (references) as training data, and recognition performance improves as the amount of such matched data increases. In this study we investigate the case where matched data for DLM training is limited or not available at all, and explore methods to improve ASR accuracy by incorporating acoustic and text data that come from separate sources. For semi-supervised training, we utilize a confusion model to generate artificial hypotheses in place of the real ASR N-best lists. For unsupervised training, we propose three target output selection methods to stand in for the missing reference. We handle this task both as a structured prediction problem and as a reranking problem, and employ two different variants of the WER-sensitive perceptron algorithm. We show that a significant improvement over baseline ASR accuracy is obtained even when no transcribed acoustic data is available to train the DLM.
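
To make the rescoring idea in the abstract concrete, the following is a minimal sketch of supervised DLM training with a structured perceptron over N-best lists. It is not the authors' implementation. The feature set (unigram and bigram counts), the dictionary layout with `words` and `asr_score` keys, the function names, and the choice to scale each update by the difference in word errors (one common reading of "WER-sensitive") are all assumptions made for illustration.

```python
from collections import Counter


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (r != h)))    # substitution or match
        prev = curr
    return prev[-1]


def features(words):
    """Unigram and bigram counts (an assumed, illustrative feature set)."""
    feats = Counter(words)
    feats.update(zip(words, words[1:]))
    return feats


def score(weights, hyp, base_weight=1.0):
    """Linear score: scaled baseline ASR score plus weighted feature counts."""
    feats = features(hyp["words"])
    return base_weight * hyp["asr_score"] + sum(
        weights.get(f, 0.0) * c for f, c in feats.items())


def train(nbest_lists, references, epochs=5, lr=1.0):
    """Structured perceptron; each update is scaled by the word-error gap."""
    weights = {}
    for _ in range(epochs):
        for nbest, ref in zip(nbest_lists, references):
            errors = [edit_distance(ref, h["words"]) for h in nbest]
            # Target output: the hypothesis with the fewest word errors.
            oracle_i = min(range(len(nbest)), key=errors.__getitem__)
            # Current model's choice: the highest-scoring hypothesis.
            best_i = max(range(len(nbest)),
                         key=lambda i: score(weights, nbest[i]))
            gap = errors[best_i] - errors[oracle_i]
            if gap > 0:
                # Reward the oracle's features, penalize the wrong choice's.
                for f, c in features(nbest[oracle_i]["words"]).items():
                    weights[f] = weights.get(f, 0.0) + lr * gap * c
                for f, c in features(nbest[best_i]["words"]).items():
                    weights[f] = weights.get(f, 0.0) - lr * gap * c
    return weights


def rescore(weights, nbest):
    """Return the hypothesis the trained model ranks first."""
    return max(nbest, key=lambda h: score(weights, h))


# Toy data (invented for illustration): one utterance, two hypotheses.
toy_nbest = [
    {"words": ["wreck", "a", "nice", "beach"], "asr_score": -1.0},
    {"words": ["recognize", "speech"], "asr_score": -1.5},
]
toy_ref = ["recognize", "speech"]
toy_weights = train([toy_nbest], [toy_ref])
```

In this toy setup the baseline ASR score prefers the first hypothesis, and a single perceptron update is enough for `rescore(toy_weights, toy_nbest)` to select the second one. In the semi-supervised and unsupervised settings described in the abstract, the N-best lists or the reference would be replaced by artificial hypotheses or a selected target output; those components are not sketched here.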

Publisher
Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 83, October 2016, Pages 54–63