Free article download: Integrating articulatory data in deep neural network-based acoustic modeling

Article code	Journal code	Publication year	English article	Full-text version
558206	1451691	2016	23-page PDF	Free download

Title of the ISI article

Integrating articulatory data in deep neural network-based acoustic modeling



Keywords

DNN-HMM; acoustic-to-articulatory mapping; deep neural networks; acoustic modeling; electromagnetic articulography; autoencoders


Related topics

Engineering and Basic Sciences; Computer Engineering; Signal Processing


Abstract

• We test strategies to exploit articulatory data in DNN-HMM phone recognition.
• Autoencoder-transformed articulatory features produce the best results.
• Pre-training of phone classifier DNNs driven by acoustic-to-articulatory mapping.
• Utility of articulatory information in noisy conditions and in cross-speaker settings.

Hybrid deep neural network–hidden Markov model (DNN-HMM) systems have become the state of the art in automatic speech recognition. In this paper we experiment with DNN-HMM phone recognition systems that use measured articulatory information. Deep neural networks are used both to compute phone posterior probabilities and to perform acoustic-to-articulatory mapping (AAM). The AAM processes we propose are based on deep representations of the acoustic and the articulatory domains. Such representations make it possible to: (i) create different pretraining configurations of the DNNs that perform AAM; (ii) perform AAM on a transformed (through DNN autoencoders) articulatory feature (AF) space that captures strong statistical dependencies between articulators. Traditionally, neural networks that approximate the AAM are used to generate AFs that are appended to the observation vector of the speech recognition system. Here we also study a novel approach (AAM-based pretraining) in which a DNN performing the AAM is instead used to pretrain the DNN that computes the phone posteriors. Evaluations on both the MOCHA-TIMIT msak0 and the mngu0 datasets show that: (i) the recovered AFs reduce phone error rate (PER) in both clean and noisy speech conditions, with a maximum 10.1% relative phone error reduction in clean speech conditions obtained when autoencoder-transformed AFs are used; (ii) AAM-based pretraining could be a viable strategy for exploiting the small articulatory datasets that are available to improve acoustic models trained on large acoustic-only datasets.
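
The two ways of using an AAM network that the abstract contrasts can be illustrated with a minimal sketch. It is not the authors' implementation: the layer sizes, the helper names (`make_layer`, `apply_layer`, `run_network`), the random weights standing in for trained ones, and the reading of "pretraining" as copying hidden-layer weights into the phone classifier are all assumptions made for illustration, and every training step is omitted.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """Random weight matrix (n_out rows, n_in columns); a stand-in for a trained layer."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]

def apply_layer(layer, x):
    """Bias-free affine map followed by tanh (tanh is used on every layer for brevity)."""
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in layer]

def run_network(layers, x):
    """Feed x through the layers in order."""
    for layer in layers:
        x = apply_layer(layer, x)
    return x

rng = random.Random(0)
# Hypothetical sizes, not taken from the paper.
N_ACOUSTIC, N_HIDDEN, N_AF, N_PHONES = 13, 32, 12, 40

# Stand-in for a DNN that performs acoustic-to-articulatory mapping (AAM).
aam_hidden = [make_layer(N_ACOUSTIC, N_HIDDEN, rng),
              make_layer(N_HIDDEN, N_HIDDEN, rng)]
aam_output = make_layer(N_HIDDEN, N_AF, rng)

frame = [rng.gauss(0.0, 1.0) for _ in range(N_ACOUSTIC)]  # one acoustic frame

# Strategy 1 (traditional): recover AFs and append them to the observation vector.
recovered_afs = run_network(aam_hidden + [aam_output], frame)
observation = frame + recovered_afs

# Strategy 2 (AAM-based pretraining, as assumed here): initialise the phone
# classifier's hidden layers from the AAM network and add a fresh output layer.
phone_hidden = [[row[:] for row in layer] for layer in aam_hidden]
phone_output = make_layer(N_HIDDEN, N_PHONES, rng)
phone_scores = run_network(phone_hidden + [phone_output], frame)

print(len(observation), len(phone_scores))
```

In the first strategy the recognizer sees a longer observation vector (acoustic features plus recovered AFs), so the AAM network is needed at recognition time. In the second, the articulatory data only shapes the classifier's starting weights, so it would then be trained further on acoustic-only data.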

Publisher

Database: Elsevier - ScienceDirect
Journal: Computer Speech & Language - Volume 36, March 2016, Pages 173–195

Authors

Leonardo Badino, Claudia Canevari, Luciano Fadiga, Giorgio Metta
