Article code: 384234
Journal code: 660843
Publication year: 2010
English article: 9-page PDF
Full-text version: free download
English title of the ISI article
Robust speech interaction in motorcycle environment
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract

Aiming at robust spoken dialogue interaction in a motorcycle environment, we investigate various configurations for a speech front-end, which consists of speech pre-processing, speech enhancement and speech recognition components. These components are implemented as agents in the Olympus/RavenClaw framework, which is the core of a multimodal dialogue interaction interface of a wearable solution for information support of the motorcycle police force on the move. In the present effort, aiming at optimizing the speech recognition performance, different experimental setups are considered for the speech front-end. The practical value of various speech enhancement techniques is assessed and, after analysis of their performance, a collaborative scheme is proposed. In this collaborative scheme, independent speech enhancement channels operate in parallel on a common input and their outputs are fed to the multithreaded speech recognition component. The outcome of the speech recognition process is post-processed by an appropriate fusion technique, which contributes to a more accurate interpretation of the input. Investigating various fusion algorithms, we identified the AdaBoost.M1 algorithm as the one performing best. Utilizing the collaborative fusion scheme based on the AdaBoost.M1 algorithm, a significant improvement of the overall speech recognition performance was achieved. This is expressed in terms of word recognition rate and correctly recognized words, as an accuracy gain of 8.0% and 5.48%, respectively, compared to the performance of the best speech enhancement channel alone. The advance offered in the present work reaches beyond the specifics of the present application, and can be beneficial to spoken interfaces operating in non-stationary noise environments.
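
The collaborative scheme described in the abstract can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the three channel functions are hypothetical stand-ins for "enhance, then recognize" pipelines, the weights are invented, and the fusion step is a plain weighted vote rather than the AdaBoost.M1 fusion the authors report as performing best.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for "enhance, then recognize" channels. Each takes
# the same input and returns one word hypothesis; real channels would wrap
# a speech enhancement algorithm followed by a recognizer.
def channel_a(audio):
    return "stop"

def channel_b(audio):
    return "stop"

def channel_c(audio):
    return "top"

def run_channels(audio, channels):
    """Run every channel in parallel on the common input, keeping order."""
    with ThreadPoolExecutor(max_workers=len(channels)) as pool:
        futures = [pool.submit(ch, audio) for ch in channels]
        return [f.result() for f in futures]

def fuse(hypotheses, weights):
    """Weighted vote: return the hypothesis with the largest total weight."""
    scores = Counter()
    for hyp, w in zip(hypotheses, weights):
        scores[hyp] += w
    return max(scores, key=scores.get)

channels = [channel_a, channel_b, channel_c]
weights = [0.5, 0.3, 0.4]  # illustrative values, not taken from the paper
hypotheses = run_channels(b"", channels)
result = fuse(hypotheses, weights)
```

A weighted vote is used here only because it fits in a few lines; a trained fusion stage would learn how to combine the channel outputs from labelled data instead of using fixed weights.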

Publisher
Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 37, Issue 3, 15 March 2010, Pages 1827–1835
Authors
, , , ,