Techniques in rapid unsupervised speaker adaptation based on HMM-Sufficient Statistics

Article code	Journal code	Publication year	English article	Full-text version
565411	875759	2009	16-page PDF	Free download

Keywords

Unsupervised adaptation; Rapid adaptation; Speaker adaptation

Related topics

Engineering and Basic Sciences; Computer Engineering; Signal Processing

Abstract

A speech recognition system that is robust to speaker variation needs a reliable adaptation algorithm. Most adaptation techniques require a large amount of adaptation data from the target speaker. The time needed to gather and transcribe adaptation utterances, together with the time needed to execute adaptation, limits their application to speech recognition. We propose a rapid approach to speaker adaptation. We employ HMM-Sufficient Statistics to store speaker-dependent subspaces. N-Closest speaker selection is employed to resolve the combinatorics of the speaker-dependent subspaces during recognition. This approach gives the adapted model a direct correspondence with the target speaker, because the target speaker's utterance is used for the N-Closest speaker selection. The proposed method employs a series of adaptation processes: first, the general model is trained; it is then adapted to broad gender/age classes, which are further adapted to speaker-specific data. Since the HMM-Sufficient Statistics are pre-computed offline, little computation is needed to carry out the adaptation task online. Moreover, the method requires only a single arbitrary utterance from the target speaker for adaptation. In this paper, we discuss the modification, expansion, and improvement of rapid adaptation based on HMM-Sufficient Statistics in the framework of Baum-Welch and maximum likelihood linear regression (MLLR). Experimental results using conventional MLLR, speaker-adaptive training, and constrained MLLR (CMLLR) are evaluated and compared. We also tested for robustness in office, car, crowd and booth environments under several SNR conditions.
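
The two online steps named in the abstract, N-Closest speaker selection followed by model construction from pre-computed sufficient statistics, can be illustrated with a deliberately simplified sketch. It is not the authors' implementation. It assumes one diagonal-covariance Gaussian per speaker for selection and a single Gaussian in place of a full HMM state set, and every name in it (`log_gauss`, `select_n_closest`, `combine_stats`, the `(count, sum_x, sum_x2)` layout) is hypothetical.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def select_n_closest(frames, speaker_models, n):
    """Rank training speakers by the log-likelihood of the target frames.

    speaker_models maps a speaker id to (mean, var); this stands in for
    whatever per-speaker model a real system would score against.
    """
    scores = {
        spk: sum(log_gauss(f, mean, var) for f in frames)
        for spk, (mean, var) in speaker_models.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:n]


def combine_stats(stats_list):
    """Pool per-speaker statistics and re-estimate one Gaussian.

    Each entry is (count, sum_x, sum_x2): the occupancy count and the
    first- and second-order sums, assumed to be accumulated offline.
    """
    dim = len(stats_list[0][1])
    count = sum(s[0] for s in stats_list)
    sum_x = [sum(s[1][d] for s in stats_list) for d in range(dim)]
    sum_x2 = [sum(s[2][d] for s in stats_list) for d in range(dim)]
    mean = [sx / count for sx in sum_x]
    var = [sx2 / count - m * m for sx2, m in zip(sum_x2, mean)]
    return mean, var


# Toy data: three training speakers and a two-frame target utterance.
models = {"a": ([0.0], [1.0]), "b": ([5.0], [1.0]), "c": ([0.5], [1.0])}
stats = {
    "a": (2.0, [0.0], [2.0]),
    "b": (2.0, [10.0], [52.0]),
    "c": (2.0, [2.0], [4.0]),
}
closest = select_n_closest([[0.1], [0.2]], models, n=2)
adapted_mean, adapted_var = combine_stats([stats[s] for s in closest])
```

Because the per-speaker sums are accumulated ahead of time, the online cost in this sketch is one scoring pass over the target utterance plus a handful of additions and divisions, which is the property the abstract attributes to pre-computing the statistics offline.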

Publisher

Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 51, Issue 1, January 2009, Pages 42–57

Authors

Randy Gomez, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano
