Free article download: Improved mispronunciation detection with deep neural network trained acoustic models and transfer learning based logistic regression classifiers

Article code	Journal code	Publication year	English article	Full-text version
6961164	1452033	2015	13-page PDF	Free download

English title of the ISI article

Improved mispronunciation detection with deep neural network trained acoustic models and transfer learning based logistic regression classifiers

Keywords

Computer-aided language learning; Mispronunciation detection; Deep neural network; Logistic regression; Transfer learning

Related subjects

Engineering and Basic Sciences; Computer Engineering; Signal Processing

Abstract

Mispronunciation detection is an important part of a Computer-Aided Language Learning (CALL) system. By automatically pointing out where mispronunciations occur in an utterance, such a system can give a language learner informative and to-the-point feedback. In this paper, we improve mispronunciation detection performance with a Deep Neural Network (DNN) trained acoustic model and transfer learning based Logistic Regression (LR) classifiers. The acoustic model trained by the conventional GMM-HMM based approach is refined by DNN training with enhanced discrimination. The corresponding Goodness Of Pronunciation (GOP) scores are revised to evaluate the pronunciation quality of non-native language learners robustly. We also propose a Neural Network (NN) based LR classifier for mispronunciation detection. A general neural network with shared hidden layers for extracting useful speech features is first pre-trained on pooled training data, in the sense of transfer learning; phone-dependent, 2-class logistic regression classifiers are then trained as phone-specific output layer nodes. The new LR classifier streamlines the training of multiple individual classifiers by learning a common feature representation via the shared hidden layer. Experimental results on an isolated English word corpus recorded by non-native (L2) English learners show that the proposed GOP measure improves the performance of the GOP based mispronunciation detection approach: the precision and recall rates are both improved by 7.4% compared with the conventional GOP estimated from GMM-HMM. The NN-based LR classifier improves the equal precision-recall rate by 25% over the best GOP based approach. It also outperforms the state-of-the-art Support Vector Machine (SVM) based classifier, improving the equal precision-recall rate by 2.2%. Our approaches achieve similar results on a continuous read, L2 Mandarin language learning corpus.
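
The classifier structure described in the abstract (hidden layers shared across all phones, with one 2-class logistic regression output node per phone) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the class name SharedHiddenPhoneLR, the single sigmoid hidden layer, the layer sizes, and the learning rate are all assumptions made for illustration. For brevity the sketch trains the shared layer and the output nodes jointly, whereas the abstract describes pre-training the shared layers on pooled data first and then training the phone-dependent output nodes.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class SharedHiddenPhoneLR:
    """Hypothetical sketch: one shared hidden layer feeding one
    2-class logistic output node per phone."""

    def __init__(self, n_features, n_hidden, n_phones, seed=0):
        rng = np.random.default_rng(seed)
        # Shared feature extractor (used by every phone).
        self.W_h = rng.normal(0.0, 0.1, size=(n_features, n_hidden))
        self.b_h = np.zeros(n_hidden)
        # One logistic regression node (column) per phone.
        self.W_o = rng.normal(0.0, 0.1, size=(n_hidden, n_phones))
        self.b_o = np.zeros(n_phones)

    def hidden(self, X):
        return sigmoid(X @ self.W_h + self.b_h)

    def predict_proba(self, X, phone_ids):
        """Probability of the positive class, read from the output
        node that belongs to each sample's canonical phone."""
        logits = self.hidden(X) @ self.W_o + self.b_o
        rows = np.arange(X.shape[0])
        return sigmoid(logits[rows, phone_ids])

    def train_step(self, X, phone_ids, y, lr=0.1):
        """One full-batch gradient step on mean cross-entropy.
        Each sample updates only its own phone's output node,
        but every sample updates the shared hidden layer."""
        n = X.shape[0]
        rows = np.arange(n)
        H = self.hidden(X)
        p = sigmoid((H @ self.W_o + self.b_o)[rows, phone_ids])
        eps = 1e-12
        loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

        # Gradient w.r.t. the logits: nonzero only at the selected node.
        d_logit = np.zeros((n, self.W_o.shape[1]))
        d_logit[rows, phone_ids] = (p - y) / n
        d_H = d_logit @ self.W_o.T
        d_pre = d_H * H * (1.0 - H)  # back through the hidden sigmoid

        self.W_o -= lr * (H.T @ d_logit)
        self.b_o -= lr * d_logit.sum(axis=0)
        self.W_h -= lr * (X.T @ d_pre)
        self.b_h -= lr * d_pre.sum(axis=0)
        return float(loss)
```

In this sketch, each input row would be a feature vector for one phone segment, phone_ids would hold the index of the canonical phone for that segment, and y would mark the segment as one class or the other (for example, mispronounced or not). Because the hidden layer is updated by samples from every phone, phones with little training data still benefit from the common representation, which is the motivation the abstract gives for sharing it.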

Publisher

Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 67, March 2015, Pages 154-166

Authors

Wenping Hu, Yao Qian, Frank K. Soong, Yong Wang
