Free article download: Large vocabulary Russian speech recognition using syntactico-statistical language modeling

Article code	Journal code	Publication year	English article	Full-text version
567034	1452042	2014	16-page PDF	Free download

English title of the ISI article

Large vocabulary Russian speech recognition using syntactico-statistical language modeling

Keywords

Automatic speech recognition, Slavic languages, Russian speech, language modeling, syntactic analysis

Related topics

Engineering and Basic Sciences, Computer Engineering, Signal Processing

English abstract

• An approach for LM creation combining syntactical and statistical analysis of training texts.
• A combined knowledge-based statistical phoneme set selection method for obtaining an optimal set for ASR.
• Results of experiments on Russian ASR with a large vocabulary of over 200K words.

Speech is the most natural means of human communication, and convenient, efficient human–computer interaction requires state-of-the-art spoken language technology. Research in this area has traditionally focused on a few major languages, such as English, French, Spanish, Chinese and Japanese, while other languages, particularly those of Eastern Europe, have received much less attention. Recently, however, research on speech technologies for Czech, Polish, Serbo-Croatian and Russian has been steadily increasing.

In this paper, we describe our efforts to build an automatic speech recognition (ASR) system for the Russian language with a large vocabulary. Russian is a synthetic and highly inflected language with many roots and affixes. This greatly reduces the performance of ASR systems designed using traditional approaches. In our work, we paid special attention to the specifics of the Russian language when developing the acoustic, lexical and language models. A special software tool for pronunciation lexicon creation was developed. For the acoustic model, we investigated a combination of knowledge-based and statistical approaches to create several different phoneme sets, the best of which was determined experimentally. For the language model (LM), we introduced a new method that combines syntactical and statistical analysis of the training text data in order to build better n-gram models.

Evaluation experiments were performed using two different Russian speech databases and an internally collected text corpus. Among the several phoneme sets we created, the one that achieved the fewest word-level recognition errors was the set with 47 phonemes, so we used it in the subsequent language modeling evaluations. Experiments with a 204-thousand-word vocabulary ASR system were performed to compare standard statistical n-gram LMs with the language models created using our syntactico-statistical method. The results demonstrated that the proposed language modeling approach is capable of reducing word recognition errors.
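
The abstract does not spell out how syntactic and statistical analysis are combined, so the following is only an illustrative sketch, not the authors' method. It assumes one plausible reading: ordinary adjacent-word bigram counts are supplemented with word pairs that a syntactic parser reports as related even when the words are not adjacent. The function names, the `dependency_pairs` input, and the toy sentence are all hypothetical.

```python
from collections import Counter


def count_bigrams(sentences, dependency_pairs=()):
    """Count adjacent-word bigrams, then add extra syntactically linked pairs.

    `dependency_pairs` is a hypothetical input: (first, second) word pairs that
    a syntactic parser might report for related but non-adjacent words.
    """
    counts = Counter()
    for words in sentences:
        # Statistical part: every pair of neighbouring words.
        counts.update(zip(words, words[1:]))
    # Syntactic part (assumed): long-distance pairs counted like bigrams.
    counts.update(dependency_pairs)
    return counts


def bigram_probability(counts, first, second):
    """Maximum-likelihood estimate of P(second | first), with no smoothing."""
    total = sum(c for (w1, _), c in counts.items() if w1 == first)
    return counts[(first, second)] / total if total else 0.0


sentences = [["dogs", "in", "the", "yard", "bark"]]
plain = count_bigrams(sentences)
# Assumed parser output: the subject "dogs" is linked to the verb "bark".
augmented = count_bigrams(sentences, dependency_pairs=[("dogs", "bark")])

print(bigram_probability(plain, "dogs", "bark"))
print(bigram_probability(augmented, "dogs", "bark"))
```

In this toy case the plain model never sees "dogs bark" because three words separate the subject from the verb, while the augmented counts give that pair a nonzero probability. A real system would also need smoothing and a much larger corpus.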

Publisher

Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 56, January 2014, Pages 213–228

Authors

Alexey Karpov, Konstantin Markov, Irina Kipyatkova, Daria Vazhenina, Andrey Ronzhin
