Free article download: Using out-of-language data to improve an under-resourced speech recognizer

Article code	Journal code	Publication year	English article	Full-text version
567029	1452042	2014	10-page PDF	Free download

English title of the ISI article

Using out-of-language data to improve an under-resourced speech recognizer

Persian translation of the title

استفاده از داده های خارج از زبان برای بهبود شناسایی گفتار با منابع محدود

Keywords

Multilingual speech recognition; Afrikaans; Under-resourced languages

Related topics

Engineering and Basic Sciences, Computer Engineering, Signal Processing

Abstract

Under-resourced speech recognizers may benefit from data in languages other than the target language. In this paper, we report how to boost the performance of an Afrikaans automatic speech recognition system by using already available Dutch data. We successfully exploit available multilingual resources through (1) posterior features, estimated by multilayer perceptrons (MLP) and (2) subspace Gaussian mixture models (SGMMs). Both the MLPs and the SGMMs can be trained on out-of-language data. We use three different acoustic modeling techniques, namely Tandem, Kullback–Leibler divergence based HMMs (KL-HMM) as well as SGMMs and show that the proposed multilingual systems yield 12% relative improvement compared to a conventional monolingual HMM/GMM system only trained on Afrikaans. We also show that KL-HMMs are extremely powerful for under-resourced languages: using only six minutes of Afrikaans data (in combination with out-of-language data), KL-HMM yields about 30% relative improvement compared to conventional maximum likelihood linear regression and maximum a posteriori based acoustic model adaptation.
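
As a rough illustration of the KL-HMM idea named in the abstract: in the usual formulation, each HMM state holds a categorical distribution over the MLP output classes, and a frame is scored by the Kullback-Leibler divergence between that distribution and the frame's posterior feature vector. The sketch below assumes that formulation. The direction of the divergence, the function names, the smoothing constant and the three-class numbers are all illustrative assumptions and are not taken from the paper.

```python
import math


def kl_divergence(state_dist, posterior, eps=1e-10):
    """KL(state_dist || posterior) for two categorical distributions.

    eps is a small smoothing constant (an assumption) that avoids log(0).
    """
    return sum(
        y * math.log((y + eps) / (z + eps))
        for y, z in zip(state_dist, posterior)
        if y > 0.0
    )


def best_state(states, posterior):
    """Return the name of the state whose distribution has the lowest KL score."""
    return min(states, key=lambda name: kl_divergence(states[name], posterior))


# Hypothetical three-class example; the values are made up for illustration.
states = {
    "a": [0.8, 0.1, 0.1],
    "b": [0.1, 0.8, 0.1],
}
frame_posterior = [0.7, 0.2, 0.1]  # stand-in for one frame of MLP output

closest = best_state(states, frame_posterior)
```

Because the MLP that produces the posteriors can be trained on out-of-language data, only the small per-state distributions need to be estimated from the target language in this picture, which is one plausible reading of why the approach suits very small amounts of Afrikaans data.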

► We boost the performance of an Afrikaans speech recognizer by using Dutch data.
► We successfully exploit multilingual resources with Tandem, KL-HMM and SGMMs.
► The proposed systems yield 12% relative improvement compared to a monolingual system.
► With only six minutes of data, KL-HMM outperforms all other systems including MLLR and MAP.
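
The "relative improvement" figures above are ratios against the baseline, not absolute differences. The sketch below shows the arithmetic, assuming the underlying metric is an error rate such as word error rate (the abstract does not name the metric). The function name and the example numbers are hypothetical and are not results from the paper.

```python
def relative_improvement(baseline_error, new_error):
    """Fraction by which new_error improves on baseline_error."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


# Hypothetical numbers: a drop from 10.0% to 8.8% error is 1.2 points
# absolute, which corresponds to a 12% relative improvement.
gain = relative_improvement(10.0, 8.8)
```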

Publisher

Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 56, January 2014, Pages 142–151

Authors

David Imseng, Petr Motlicek, Hervé Bourlard, Philip N. Garner
