Article code: 4977826
Journal code: 1452011
Publication year: 2017
English article: 19 pages, PDF
Full-text version: free download
English title of the ISI article
Error detection and accuracy estimation in automatic speech recognition using deep bidirectional recurrent neural networks
Keywords
Automatic speech recognition, error detection, accuracy estimation, conditional random fields, deep bidirectional recurrent neural networks
Related topics
Engineering and Basic Sciences, Computer Engineering, Signal Processing
English abstract
Recurrent neural networks (RNNs) have recently been applied as classifiers for sequential labeling problems. In this paper, deep bidirectional RNNs (DBRNNs) are applied to error detection in automatic speech recognition (ASR), which is a sequential labeling problem. We investigate three types of ASR error detection tasks: confidence estimation, out-of-vocabulary word detection, and error type classification. We also estimate ASR accuracy, i.e., percent correct and word accuracy, from the error type classification results. Experimental results using English and Japanese lecture speech corpora show that the DBRNNs greatly outperform conditional random fields (CRFs) and the other NN structures, i.e., deep feedforward NNs (DNNs) and deep unidirectional RNNs (DURNNs). These improvements arise because the DBRNNs can take the longer bidirectional context of the input feature vectors into account and can model highly nonlinear relationships between the input feature vectors and the output labels. In detailed analyses, the DBRNNs show better generalization ability than the CRFs, which is attributed to the ability of the DBRNNs to represent (embed) words in a low-dimensional continuous-valued vector space. In addition, the superiority of the DBRNNs over the DNNs and DURNNs indicates that the average length of the input feature context required for ASR error detection is only a few time steps, although it changes (lengthens) depending on the situation.
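
The abstract mentions estimating percent correct and word accuracy from error type classification results. A minimal sketch of that arithmetic follows, using the standard definitions: percent correct = C / N and word accuracy = (C - I) / N, where N = C + S + D is the number of reference words. The label encoding ("C", "S", "D", "I" as one flat list) and the function name estimate_accuracy are assumptions made for illustration; the listing does not say how the paper encodes its labels, deletions in particular.

```python
from collections import Counter

VALID_LABELS = {"C", "S", "D", "I"}


def estimate_accuracy(labels):
    """Estimate percent correct and word accuracy from error-type labels.

    Assumed (hypothetical) label encoding, not taken from the paper:
      "C" = correct word, "S" = substitution,
      "D" = deleted reference word, "I" = inserted word.
    """
    counts = Counter(labels)
    unknown = set(counts) - VALID_LABELS
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")

    c, s, d, i = (counts[k] for k in ("C", "S", "D", "I"))
    n_ref = c + s + d  # number of words in the reference transcript
    if n_ref == 0:
        raise ValueError("no reference words; accuracy is undefined")

    percent_correct = 100.0 * c / n_ref      # ignores insertions
    word_accuracy = 100.0 * (c - i) / n_ref  # penalizes insertions
    return percent_correct, word_accuracy


# Toy label sequence: 7 correct, 2 substitutions, 1 deletion, 1 insertion.
labels = ["C", "C", "S", "C", "I", "C", "D", "C", "C", "S", "C"]
pc, wa = estimate_accuracy(labels)
print(f"percent correct = {pc:.1f}%, word accuracy = {wa:.1f}%")
```

Word accuracy can be negative when insertions outnumber correct words, which is why it is reported separately from percent correct.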
Publisher
Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 89, May 2017, Pages 70-83