Environmentally robust ASR front-end for deep neural network acoustic models

کد مقاله	کد نشریه	سال انتشار	مقاله انگلیسی	نسخه تمام متن
10368484	874786	2015	22 صفحه PDF	دانلود رایگان

عنوان انگلیسی مقاله ISI

دانلود مقاله + سفارش ترجمه

دانلود مقاله ISI انگلیسی

رایگان برای ایرانیان

کلمات کلیدی

Front-end - جلوی پایان Deep neural network - شبکه عصبی عمیق

موضوعات مرتبط

مهندسی و علوم پایه مهندسی کامپیوتر پردازش سیگنال

پیش نمایش صفحه اول مقاله

Environmentally robust ASR front-end for deep neural network acoustic models

چکیده انگلیسی

This paper examines the individual and combined impacts of various front-end approaches on the performance of deep neural network (DNN) based speech recognition systems in distant talking situations, where acoustic environmental distortion degrades the recognition performance. Training of a DNN-based acoustic model consists of generation of state alignments followed by learning the network parameters. This paper first shows that the network parameters are more sensitive to the speech quality than the alignments and thus this stage requires improvement. Then, various front-end robustness approaches to addressing this problem are categorised based on functionality. The degree to which each class of approaches impacts the performance of DNN-based acoustic models is examined experimentally. Based on the results, a front-end processing pipeline is proposed for efficiently combining different classes of approaches. Using this front-end, the combined effects of different classes of approaches are further evaluated in a single distant microphone-based meeting transcription task with both speaker independent (SI) and speaker adaptive training (SAT) set-ups. By combining multiple speech enhancement results, multiple types of features, and feature transformation, the front-end shows relative performance gains of 7.24% and 9.83% in the SI and SAT scenarios, respectively, over competitive DNN-based systems using log mel-filter bank features.

ناشر

Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Computer Speech & Language - Volume 31, Issue 1, May 2015, Pages 65-86

نویسندگان

T. Yoshioka, M.J.F. Gales,

علوم انسانی و هنر

فنی، مهندسی و علوم پایه

پزشکی و سلامت

بیو تکنولوژی

پذیرش سفارش ترجمه

Environmentally robust ASR front-end for deep neural network acoustic models

دسترسی سریع

ارتباط

English Website