Free article download: Capitalising on North American speech resources for the development of a South African English large vocabulary speech recognition system

Article code	Journal code	Publication year	English article	Full-text version
559022	875034	2014	14-page PDF	Free download

English title of the ISI article

Capitalising on North American speech resources for the development of a South African English large vocabulary speech recognition system

Persian translation of the title

Using North American speech resources to develop a South African English large vocabulary speech recognition system


Keywords

Under-resourced languages, accented speech, South African varieties of English

Related topics

Engineering and Basic Sciences, Computer Engineering, Signal Processing


English abstract

• We try to develop or improve a South African (SA) speech recogniser with US data.
• The SA domain is under-resourced while the US domain is very well-resourced.
• US pronunciations and language models can feasibly replace SA counterparts.
• US acoustic data used in a SA system results in a large performance penalty.
• US acoustic and language model data slightly improve a SA system by adaptation.

South African English is currently considered an under-resourced variety of English. Extensive speech resources are, however, available for North American (US) English. In this paper we consider the use of these US resources in the development of a South African large vocabulary speech recognition system. Specifically we consider two research questions. Firstly, we determine the performance penalties that are incurred when using US instead of South African language models, pronunciation dictionaries and acoustic models. Secondly, we determine whether US acoustic and language modelling data can be used in addition to the much more limited South African resources to improve speech recognition performance. In the first case we find that using a US pronunciation dictionary or a US language model in a South African system results in fairly small penalties. However, a substantial penalty is incurred when using a US acoustic model. In the second investigation we find that small but consistent improvements over a baseline South African system can be obtained by the additional use of US acoustic data. Larger improvements are obtained when complementing the South African language modelling data with US and/or UK material. We conclude that, when developing resources for an under-resourced variety of English, the compilation of acoustic data should be prioritised, language modelling data has a weaker effect on performance and the pronunciation dictionary the smallest.

Publisher

Database: Elsevier - ScienceDirect
Journal: Computer Speech & Language - Volume 28, Issue 6, November 2014, Pages 1255–1268

Authors

Herman Kamper, Febe de Wet, Thomas Hain, Thomas Niesler
