Article code: 558218
Journal code: 1451691
Publication year: 2016
English paper: 18-page PDF
Full-text version: free download
English title of the ISI paper
Differenced maximum mutual information criterion for robust unsupervised acoustic model adaptation
Persian translation of the title
معیار اطلاعات متقابل حداکثر تفاضلی برای سازگاری مدل صوتی قدرتمند بدون نظارت
Keywords
Discriminative criterion; Differenced maximum mutual information; Speech recognition; Acoustic model adaptation; Unsupervised adaptation
Related subjects
Engineering and Basic Sciences > Computer Engineering > Signal Processing
English abstract


• The differenced-MMI (dMMI) is a discriminative criterion that generalizes MPE and BMMI.
• We discuss the behavior of dMMI when there are errors in the transcription labels.
• dMMI may be less sensitive to such errors than other criteria.
• We support our claim with unsupervised speaker adaptation experiments.
• dMMI-based adaptation achieves significant gains over MLLR on two LVCSR tasks.

Discriminative criteria have been widely used for training acoustic models for automatic speech recognition (ASR). Many discriminative criteria have been proposed including maximum mutual information (MMI), minimum phone error (MPE), and boosted MMI (BMMI). Discriminative training is known to provide significant performance gains over conventional maximum-likelihood (ML) training. However, as discriminative criteria aim at direct minimization of the classification error, they strongly rely on having accurate reference labels. Errors in the reference labels directly affect the performance. Recently, the differenced MMI (dMMI) criterion has been proposed for generalizing conventional criteria such as BMMI and MPE. dMMI can approach BMMI or MPE if its hyper-parameters are properly set. Moreover, dMMI introduces intermediate criteria that can be interpreted as smoothed versions of BMMI or MPE. These smoothed criteria are robust to errors in the reference labels. In this paper, we demonstrate the effect of dMMI on unsupervised speaker adaptation where the reference labels are estimated from a first recognition pass and thus inevitably contain errors. In particular, we introduce dMMI-based linear regression (dMMI-LR) adaptation and demonstrate significant gains in performance compared with MLLR and BMMI-LR in two large vocabulary lecture recognition tasks.
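For readers unfamiliar with dMMI, the sketch below shows the "differenced" structure the abstract refers to: the criterion is a scaled difference of two boosted-MMI objectives controlled by two margin hyper-parameters. The notation (Lambda for the acoustic model parameters, sigma_1 and sigma_2 for the margin hyper-parameters, F_BMMI for the boosted-MMI objective) is illustrative and not copied from the paper.

% Illustrative sketch only (notation not taken from the paper): dMMI written
% as a finite difference of two boosted-MMI (BMMI) objectives whose margin
% hyper-parameters sigma_1 < sigma_2 control how closely it approaches BMMI,
% MPE, or a smoothed criterion in between.
\documentclass{article}
\usepackage{amsmath}
\begin{document}
\[
  \mathcal{F}_{\mathrm{dMMI}}(\Lambda;\sigma_1,\sigma_2)
  = \frac{\mathcal{F}_{\mathrm{BMMI}}(\Lambda;\sigma_2)
        - \mathcal{F}_{\mathrm{BMMI}}(\Lambda;\sigma_1)}
       {\sigma_2 - \sigma_1},
  \qquad \sigma_1 < \sigma_2,
\]
where $\mathcal{F}_{\mathrm{BMMI}}(\Lambda;\sigma)$ denotes a boosted-MMI
objective for acoustic model parameters $\Lambda$ with boosting (margin)
factor $\sigma$.
\end{document}

Read this way, letting sigma_1 and sigma_2 both shrink toward zero turns the difference quotient into the derivative of the BMMI objective with respect to its margin, which behaves like an MPE-style expected-accuracy criterion, while, roughly, letting one margin grow large in magnitude pushes dMMI toward plain BMMI; intermediate settings yield the smoothed, label-error-robust criteria highlighted in the abstract.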

Publisher
Database: Elsevier - ScienceDirect
Journal: Computer Speech & Language - Volume 36, March 2016, Pages 24–41
Authors