A study on integrating acoustic-phonetic information into lattice rescoring for automatic speech recognition

Article code	Journal code	Publication year	English article	Full-text version
565364	875744	2009	15-page PDF	Free download

Keywords

Knowledge based system; Large vocabulary continuous speech recognition

Related topics

Engineering and Basic Sciences > Computer Engineering > Signal Processing

Abstract

In this paper, a lattice rescoring approach to integrating acoustic-phonetic information into automatic speech recognition (ASR) is described. Additional information over what is used in conventional log-likelihood based decoding is provided by a bank of speech event detectors that score manner and place of articulation events with log-likelihood ratios that are treated as confidence levels. An artificial neural network (ANN) is then used to transform raw log-likelihood ratio scores into manageable terms for easy incorporation. We refer to the union of the event detectors and the ANN as the knowledge module. A goal of this study is to design a generic framework which makes it easier to incorporate other sources of information into an existing ASR system. Another aim is to start investigating the possibility of building a generic knowledge module that can be plugged into an ASR system without being trained on specific data for the given task. To this end, the proposed approach is evaluated on three diverse ASR tasks: continuous phone recognition, connected digit recognition, and large vocabulary continuous speech recognition, but the data-driven knowledge module is trained with a single corpus and used in all three evaluation tasks without further training. Experimental results indicate that in all three cases the proposed rescoring framework achieves better results than those obtained without incorporating the confidence scores provided by the knowledge module. It is interesting to note that the rescoring process is especially effective in correcting utterances with errors in large vocabulary continuous speech recognition, where constraints imposed by the lexical and language models sometimes produce recognition results not strictly observing the underlying acoustic-phonetic properties.
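
To make the idea of lattice rescoring concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not state how the knowledge-module scores are combined with the decoder scores, so the weighted sum used here is an assumption, and the names `Arc`, `best_path`, `decoder_score`, `knowledge_score`, and `weight` are all hypothetical. The toy lattice and its scores are invented purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Arc:
    start: int              # source node; nodes are numbered in topological order
    end: int                # destination node, always greater than start
    label: str              # word or phone hypothesis carried by this arc
    decoder_score: float    # log-likelihood assigned by the baseline decoder
    knowledge_score: float  # hypothetical log-domain score from a knowledge module


def best_path(arcs: List[Arc], final_node: int, weight: float) -> Tuple[List[str], float]:
    """Return the best label sequence from node 0 to final_node and its score.

    Each arc is rescored as decoder_score + weight * knowledge_score.
    This combination rule is an assumption made for illustration only.
    """
    # best[node] holds the best (score, labels) found so far for reaching node.
    best: Dict[int, Tuple[float, List[str]]] = {0: (0.0, [])}
    # Visiting arcs by increasing start node is a valid dynamic-programming
    # order because every arc goes from a lower to a higher node number.
    for arc in sorted(arcs, key=lambda a: a.start):
        if arc.start not in best:
            continue  # start node is unreachable from node 0
        score, labels = best[arc.start]
        candidate = score + arc.decoder_score + weight * arc.knowledge_score
        if arc.end not in best or candidate > best[arc.end][0]:
            best[arc.end] = (candidate, labels + [arc.label])
    return best[final_node][1], best[final_node][0]


# Toy lattice with two competing hypotheses for the first word (made-up scores).
toy_lattice = [
    Arc(0, 1, "nine", -10.0, -3.0),
    Arc(0, 1, "five", -9.5, -6.0),
    Arc(1, 2, "oh", -4.0, -1.0),
]

baseline_labels, baseline_score = best_path(toy_lattice, final_node=2, weight=0.0)
rescored_labels, rescored_score = best_path(toy_lattice, final_node=2, weight=0.5)
```

With `weight=0.0` the search reproduces the decoder's own best path; with a positive weight, the hypothesis that agrees better with the extra scores can overtake it, which is the kind of correction the abstract attributes to rescoring.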

Publisher

Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 51, Issue 11, November 2009, Pages 1139–1153

Authors

Sabato Marco Siniscalchi, Chin-Hui Lee
