Article code | Journal code | Publication year | English article | Full-text version
4977825 | 1452011 | 2017 | 15-page PDF | Free download
English title of the ISI article
A comparative study of noise estimation algorithms for nonlinear compensation in robust speech recognition
Persian translation of the title
بررسی مقایسه ای الگوریتم های تخمینی نویز برای غرامت غیرخطی در تشخیص گفتار قوی
Keywords
Factor analysis, Gauss-Newton method, Nonlinear compensation, Parallel model combination, Robust speech recognition, Vector Taylor series
Related topics
Engineering and Basic Sciences > Computer Engineering > Signal Processing
English abstract
Nonlinear compensation models use a nonlinear mismatch function, which characterizes the joint effects of additive and convolutional noise, to realize noise-robust speech recognition. Representative compensation models include vector Taylor series (VTS), data-driven parallel model combination (DPMC), and the unscented transform (UT). The noise parameters of the compensation models, often estimated in the maximum likelihood (ML) sense, are known to play an important role in system performance under noisy conditions. In this paper, we conduct a systematic comparison between two popular approaches for estimating the noise parameters. The first approach employs the Gauss-Newton method in a generalized EM framework to iteratively maximize the EM auxiliary function. The second approach views the compensation models from a generative perspective, giving rise to an EM algorithm analogous to ML estimation for factor analysis (EM-FA). We demonstrate a close connection between these two approaches: both belong to the family of gradient-based methods but have different convergence rates. Note that the convergence property can be crucial to noise estimation, since model compensation may be carried out frequently in changing noisy environments to retain the desired performance. Furthermore, we present an in-depth discussion of the advantages and limitations of the two approaches, and illustrate how to extend them to allow for adaptive training. The investigated noise estimation approaches are evaluated on several tasks. The first is to fit a GMM to artificially corrupted samples; speech recognition is then performed on the Aurora 2 and Aurora 4 tasks.
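As a rough illustration of the ideas named in the abstract, the sketch below uses a one-dimensional log-spectral mismatch function, y = x + h + log(1 + exp(n - x - h)), a first-order VTS approximation around the speech and noise means, and a toy Gauss-Newton update for the noise mean. The abstract does not state the mismatch function; the form used here is the one commonly seen in the VTS literature and is an assumption, as are all function names. The Gauss-Newton loop minimizes a plain least-squares residual and is not the generalized EM procedure studied in the paper.

```python
import math


def mismatch(x, n, h=0.0):
    """Assumed log-spectral mismatch: y = x + h + log(1 + exp(n - x - h)).

    x: clean speech, n: additive noise, h: convolutional (channel) term.
    """
    return x + h + math.log1p(math.exp(n - x - h))


def vts_jacobians(x0, n0, h0=0.0):
    """Partial derivatives of y at the expansion point (x0, n0, h0)."""
    g = 1.0 / (1.0 + math.exp(n0 - x0 - h0))  # dy/dx (equal to dy/dh)
    return g, 1.0 - g                         # (dy/dx, dy/dn)


def vts_compensate(mu_x, var_x, mu_n, var_n, h=0.0):
    """First-order VTS approximation of the noisy-speech mean and variance."""
    g, f = vts_jacobians(mu_x, mu_n, h)
    mu_y = mismatch(mu_x, mu_n, h)
    var_y = g * g * var_x + f * f * var_n
    return mu_y, var_y


def estimate_noise_mean(ys, mu_x, mu_n0=0.0, iters=20):
    """Toy Gauss-Newton fit of the noise mean to observations ys.

    Hypothetical helper: it fits mismatch(mu_x, mu_n) to ys in the
    least-squares sense, with the clean-speech mean mu_x held fixed.
    """
    mu_n = mu_n0
    for _ in range(iters):
        _, f = vts_jacobians(mu_x, mu_n)
        residuals = [y - mismatch(mu_x, mu_n) for y in ys]
        # Gauss-Newton step (J^T J)^{-1} J^T r, where J = f for every sample.
        mu_n += sum(f * r for r in residuals) / (len(ys) * f * f)
    return mu_n
```

For example, `vts_compensate(mu_x, var_x, mu_n, var_n)` returns an approximate mean and variance for the noisy feature, and `estimate_noise_mean` can be run on samples generated with `mismatch` to see how quickly the update settles.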
Publisher
Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 89, May 2017, Pages 58-69