کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
563167 875473 2012 18 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Self-learning speaker identification for enhanced speech recognition
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر پردازش سیگنال
پیش نمایش صفحه اول مقاله
Self-learning speaker identification for enhanced speech recognition
چکیده انگلیسی

A novel approach for joint speaker identification and speech recognition is presented in this article. Unsupervised speaker tracking and automatic adaptation of the human–computer interface is achieved by the interaction of speaker identification, speech recognition and speaker adaptation for a limited number of recurring users. Together with a technique for efficient information retrieval a compact modeling of speech and speaker characteristics is presented. Applying speaker specific profiles allows speech recognition to take individual speech characteristics into consideration to achieve higher recognition rates. Speaker profiles are initialized and continuously adapted by a balanced strategy of short-term and long-term speaker adaptation combined with robust speaker identification. Different users can be tracked by the resulting self-learning speech controlled system. Only a very short enrollment of each speaker is required. Subsequent utterances are used for unsupervised adaptation resulting in continuously improved speech recognition rates. Additionally, the detection of unknown speakers is examined under the objective to avoid the requirement to train new speaker profiles explicitly. The speech controlled system presented here is suitable for in-car applications, e.g. speech controlled navigation, hands-free telephony or infotainment systems, on embedded devices. Results are presented for a subset of the SPEECON database. The results validate the benefit of the speaker adaptation scheme and the unified modeling in terms of speaker identification and speech recognition rates.


► Adapted speech recognition profiles can identify speakers.
► Speaker tracking allows long-term speaker adaptation.
► Combining speaker adaptation and speaker identification improves speech recognition rates.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Computer Speech & Language - Volume 26, Issue 3, June 2012, Pages 210–227
نویسندگان
, , ,