Article code: 5005605
Journal code: 1369105
Publication year: 2017
English article: 12-page PDF
Full-text version: Free download
English title of the ISI article
Automatic speech recognizers for Mexican Spanish and its open resources
Persian translation of the title
شناسایی گفتار خودکار برای اسپانیایی مکزیکی و منابع باز آن
Keywords
Automatic speech recognition, Mexican Spanish, language resources, language model, acoustic model
Related topics
Engineering and Basic Sciences, Other Engineering Disciplines, Control and Systems Engineering
English abstract
The development of automatic speech recognition systems relies on the availability of distinct language resources such as speech recordings, pronunciation dictionaries, and language models. These resources are scarce for the Mexican Spanish dialect. In this work, we present a revision of the CIEMPIESS corpus, a resource for spontaneous speech recognition in the Mexican Spanish of Central Mexico. It consists of 17 h of segmented and transcribed recordings, a phonetic dictionary composed of 53,169 unique words, and a language model composed of 1,505,491 words extracted from 2,489 university newsletters. We also evaluate the CIEMPIESS corpus using three well-known, state-of-the-art speech recognition engines, with satisfactory results. These resources are open for research and development in the field. Additionally, we present the methodology and the tools used to facilitate the creation of these resources, which can be easily adapted to other variants of Spanish, or even to other languages.
Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Applied Research and Technology - Volume 15, Issue 3, June 2017, Pages 259-270