Article code: 566681
Journal code: 1452021
Publication year: 2016
English article: 13 pages, PDF
Full-text version: free download
English title of the ISI article
General hybrid framework for uncertainty-decoding-based automatic speech recognition systems
Persian translation of the title
چارچوب ترکیبی عمومی برای سیستم های تشخیص گفتار خودکار مبتنی بر عدم قطعیت رمزگشایی
Keywords
Noise-robust speech recognition; uncertainty decoding; uncertainty propagation; speech
Related topics
Engineering and Basic Sciences; Computer Engineering; Signal Processing
English abstract


• Utilizing feature uncertainties during decoding can improve the noise robustness of ASR.
• Uncertainty estimation in the recognition domain and in the pre-processing domain is compared.
• The distribution of the recognition-domain residual noise is often not zero-mean.
• The propagated pre-processing uncertainties may suffer from approximation errors.
• Our hybrid approach mitigates these problems and achieves significant improvements.

Uncertainty decoding has recently been successful in improving automatic speech recognition performance in noisy environments by considering the pre-processed feature vectors not as deterministic but rather as random variables containing estimation errors, residual noise and also artifacts introduced by the signal pre-processors themselves. However, the achievable improvements depend strongly on how well the statistics of these random variables are estimated in the recognition domain. In this paper, we compare two approaches for estimating these statistics. The first approach directly estimates the needed statistics in the recognition domain. The second one estimates the statistics in the processing domain and then propagates them through the typically nonlinear feature extraction to obtain the corresponding statistics in the recognition domain. Based on this distinction, we propose a new hybrid approach that combines the advantages of both approaches and avoids their disadvantages. The new hybrid approach can be used with any speech pre-processor, which enables wider usage of the uncertainty decoding approach instead of the conventional maximum likelihood approach.
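The two ideas the abstract relies on can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes scalar features, Gaussian acoustic models and Gaussian feature uncertainty, and all function names and the `bias` parameter are hypothetical. The first part shows the textbook form of uncertainty decoding, where the feature variance is added to the model variance. The second part propagates a mean and variance through a logarithm, used here as a stand-in for a nonlinear feature extraction step, once with a first-order approximation and once by sampling.

```python
import math
import random


def log_gauss(x, mean, var):
    """Log density of a scalar Gaussian N(x; mean, var)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def ud_log_likelihood(feat_mean, feat_var, model_mean, model_var, bias=0.0):
    """Uncertainty-decoding score for one feature dimension (illustrative).

    The enhanced feature is treated as a Gaussian random variable with mean
    feat_mean and variance feat_var. Marginalising over it yields a Gaussian
    with variance model_var + feat_var. `bias` is a hypothetical correction
    for residual noise that is not zero-mean.
    """
    return log_gauss(feat_mean - bias, model_mean, model_var + feat_var)


def propagate_log_first_order(mean, var):
    """First-order (Taylor) propagation of mean and variance through log(x)."""
    return math.log(mean), var / mean ** 2


def propagate_log_monte_carlo(mean, var, n=20000, seed=0):
    """Sampling-based propagation through log(x).

    Assumption: non-positive samples are discarded, so the input is
    effectively a truncated Gaussian.
    """
    rng = random.Random(seed)
    std = math.sqrt(var)
    ys = []
    for _ in range(n):
        x = rng.gauss(mean, std)
        if x > 0.0:
            ys.append(math.log(x))
    m = sum(ys) / len(ys)
    v = sum((y - m) ** 2 for y in ys) / (len(ys) - 1)
    return m, v


if __name__ == "__main__":
    # A feature far from the model mean is penalised less when it is uncertain.
    print(ud_log_likelihood(3.0, 0.0, 0.0, 1.0))
    print(ud_log_likelihood(3.0, 3.0, 0.0, 1.0))
    # Compare the two propagation schemes for a small input variance.
    print(propagate_log_first_order(10.0, 1.0))
    print(propagate_log_monte_carlo(10.0, 1.0))
```

With a small input variance the two propagation schemes agree closely. As the variance grows relative to the mean, the first-order result drifts away from the sampled one, which is one simple way to picture the approximation errors mentioned in the highlights.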

Publisher
Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 79, May 2016, Pages 1–13