کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
485455 703327 2016 7 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Building Statistical Parametric Multi-speaker Synthesis for Bangladeshi Bangla
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر علوم کامپیوتر (عمومی)
پیش نمایش صفحه اول مقاله
Building Statistical Parametric Multi-speaker Synthesis for Bangladeshi Bangla
چکیده انگلیسی

We present a text-to-speech (TTS) system designed for the dialect of Bengali spoken in Bangladesh. This work is part of an ongoing effort to address the needs of new under-resourced languages. We propose a process for streamlining the bootstrapping of TTS systems for under-resourced languages. First, we use crowdsourcing to collect the data from multiple ordinary speakers, each speaker recording small amount of sentences. Second, we leverage an existing text normalization system for a related language (Hindi) to bootstrap a linguistic front-end for Bangla. Third, we employ statistical techniques to construct multi-speaker acoustic models using Long Short-term Memory Recurrent Neural Network (LSTM-RNN) and Hidden Markov Model (HMM) approaches. We then describe our experiments that show that the resulting TTS voices score well in terms of their perceived quality as measured by Mean Opinion Score (MOS) evaluations.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Procedia Computer Science - Volume 81, 2016, Pages 194–200
نویسندگان
, , , , , ,