Article code | Journal code | Publication year | English article | Full-text version
559012 | 875029 | 2015 | 18 pages, PDF | Free download
English title of the ISI article
Text-to-speech synthesis system with Arabic diacritic recognition system
Persian translation of the title
Text-to-speech system with an Arabic diacritic recognition system
Keywords
Text-to-speech; statistical parametric; deep neural networks; natural language processing; diacritization system
Related topics
Engineering and basic sciences; Computer engineering; Signal processing
English abstract


• We developed an Arabic text-to-speech system, including a diacritization system.
• The speech synthesis system is based on a statistical parametric approach.
• We address the accuracy of diacritic and acoustic models.
• We proposed a diacritization system based on the position of the current letter.
• A synthesis system that uses one neural network per unit type generates high-quality speech.

Text-to-speech synthesis has been widely studied for many languages. However, speech synthesis for the Arabic language has not progressed sufficiently and is still at an early stage. Statistical parametric synthesis based on hidden Markov models has been the most commonly applied approach for Arabic. Recently, speech synthesized with deep neural networks was found to be as intelligible as the human voice. This paper describes a text-to-speech (TTS) synthesis system for Modern Standard Arabic based on a statistical parametric approach and Mel-cepstral coefficients. Deep neural networks have achieved state-of-the-art performance in a wide range of tasks, including speech synthesis. Our TTS system includes a diacritization system, which is very important for Arabic TTS applications. Our diacritization system is also based on deep neural networks. In addition to the use of deep learning techniques, different methods were proposed to model the acoustic parameters in order to address the problem of acoustic model accuracy. They are based on linguistic and acoustic characteristics (e.g., a letter-position-based diacritization system, a unit-type-based synthesis system, and a diacritic-mark-based synthesis system) and on deep learning techniques (stacked generalization). Experimental results show that our diacritization system can generate diacritized text with high accuracy. As for the speech synthesis system, the experimental results and subjective evaluation show that our proposed method can generate intelligible and natural speech.

Publisher
Database: Elsevier - ScienceDirect
Journal: Computer Speech & Language - Volume 34, Issue 1, November 2015, Pages 43–60