Prior-shared feature and model space speaker adaptation by consistently employing map estimation

کد مقاله	کد نشریه	سال انتشار	مقاله انگلیسی	نسخه تمام متن
565936	875866	2013	17 صفحه PDF	دانلود رایگان

عنوان انگلیسی مقاله ISI

دانلود مقاله + سفارش ترجمه

دانلود مقاله ISI انگلیسی

رایگان برای ایرانیان

کلمات کلیدی

Speech recognition - تشخیص گفتار Speaker adaptation - سازگاری بلندگو

موضوعات مرتبط

مهندسی و علوم پایه مهندسی کامپیوتر پردازش سیگنال

پیش نمایش صفحه اول مقاله

Prior-shared feature and model space speaker adaptation by consistently employing map estimation

چکیده انگلیسی

The purpose of this paper is to describe the development of a speaker adaptation method that improves speech recognition performance regardless of the amount of adaptation data. For that purpose, we propose the consistent employment of a maximum a posteriori (MAP)-based Bayesian estimation for both feature space normalization and model space adaptation. Namely, constrained structural maximum a posteriori linear regression (CSMAPLR) is first performed in a feature space to compensate for the speaker characteristics, and then, SMAPLR is performed in a model space to capture the remaining speaker characteristics. A prior distribution stabilizes the parameter estimation especially when the amount of adaptation data is small. In the proposed method, CSMAPLR and SMAPLR are performed based on the same acoustic model. Therefore, the dimension-dependent variations of feature and model spaces can be similar. Dimension-dependent variations of the transformation matrix are explained well by the prior distribution. Therefore, by sharing the same prior distribution between CSMAPLR and SMAPLR, their parameter estimations can be appropriately regularized in both spaces. Experiments on large vocabulary continuous speech recognition using the Corpus of Spontaneous Japanese (CSJ) and the MIT OpenCourseWare corpus (MIT-OCW) confirm the effectiveness of the proposed method compared with other conventional adaptation methods with and without using speaker adaptive training.

► Maximum a posteriori (MAP)-based Bayesian estimation in feature and model space.
► Prior distribution sharing for estimating feature and model space transformation matrix.
► Appropriate regularization for their parameter estimations in both spaces.
► Scalable performance improvement of the proposed approach.

ناشر

Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Speech Communication - Volume 55, Issue 3, March 2013, Pages 415–431

نویسندگان

Seong-Jun Hahm, Shinji Watanabe, Atsunori Ogawa, Masakiyo Fujimoto, Takaaki Hori, Atsushi Nakamura,

علوم انسانی و هنر

فنی، مهندسی و علوم پایه

پزشکی و سلامت

بیو تکنولوژی

پذیرش سفارش ترجمه

Prior-shared feature and model space speaker adaptation by consistently employing map estimation

دسترسی سریع

ارتباط

English Website