Article code: 406045
Journal code: 678056
Publication year: 2015
English article: 6 pages, PDF
Full text: free download
English title of the ISI article
A comparative study on selecting acoustic modeling units in deep neural networks based large vocabulary Chinese speech recognition
Related topics
Engineering and basic sciences, Computer engineering, Artificial intelligence
Abstract

This paper compares the performance of different acoustic modeling units in deep neural network (DNN) based large vocabulary continuous speech recognition (LVCSR) systems for Chinese. Recently, DNN-based acoustic modeling has achieved very competitive performance on many speech recognition tasks and has become the focus of current LVCSR research. Some previous work has studied context-independent and context-dependent DNN-based acoustic models. For Chinese, a syllabic language, the choice of basic modeling units in DNN-based LVCSR systems is a very important issue. Three basic modeling units (syllables, initial/finals, and phones) are discussed and compared. Experimental results show that, in the DNN-based systems, context-dependent (CD) phones obtain the best performance, and context-independent (CI) syllables perform similarly to CD initial/finals. How the number of clustered states affects the performance of DNN-based systems is also discussed; the behavior differs from that of GMM-based systems. In addition, by introducing a multi-task learning strategy, these multiple modeling units can be combined in the DNN training procedure. The experimental results indicate that combining the modeling units using multi-task learning outperforms each individual modeling unit.

Publisher
Database: Elsevier - ScienceDirect
Journal: Neurocomputing - Volume 170, 25 December 2015, Pages 251–256