Free article download: Deep Elman recurrent neural networks for statistical parametric speech synthesis

Article code	Journal code	Publication year	English article	Full-text version
4977794	1452007	2017	31 pages, PDF	Free download

English title of the ISI article

Deep Elman recurrent neural networks for statistical parametric speech synthesis

Keywords

Speech synthesis, recurrent neural networks, deep neural networks, hidden state

Related subjects

Engineering and Basic Sciences, Computer Engineering, Signal Processing

Abstract

Owing to the success of deep learning techniques in automatic speech recognition, deep neural networks (DNNs) have been used as acoustic models for statistical parametric speech synthesis (SPSS). DNNs do not inherently model the temporal structure in speech and text, and hence are not well suited to be applied directly to the problem of SPSS. Recurrent neural networks (RNNs), on the other hand, can model time series. RNNs with long short-term memory (LSTM) cells have been shown to outperform DNN-based SPSS. However, LSTM cells and their variants, such as gated recurrent units (GRUs) and simplified LSTMs (SLSTMs), have a complicated structure and are computationally expensive compared to a simple recurrent architecture like the Elman RNN. In this paper, we explore deep Elman RNNs for SPSS and compare their effectiveness against deep gated RNNs. Specifically, we perform experiments to show that (1) deep Elman RNNs are better suited for acoustic modeling in SPSS than DNNs and perform competitively with deep SLSTMs, GRUs and LSTMs, (2) context representation learning using Elman RNNs improves neural network acoustic models for SPSS, and (3) an Elman RNN-based duration model is better than its DNN-based counterpart. Experiments were performed on the Blizzard Challenge 2015 dataset, which consists of 3 Indian languages (Telugu, Hindi and Tamil). Through subjective and objective evaluations, we show that our proposed systems outperform the baseline systems across different speakers and languages.
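
The abstract's cost argument can be made concrete with a small sketch. The code below is not taken from the paper: it is a minimal pure-Python illustration of the standard Elman recurrence, h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h), together with a per-layer parameter count for the textbook Elman, GRU and LSTM cells (assuming one bias per affine map and no peephole connections). The function names, weight values and layer sizes are hypothetical and chosen only for illustration. SLSTM is left out because its exact definition is not given here.

```python
import math

def elman_step(x, h_prev, W_xh, W_hh, b_h):
    """One Elman recurrence: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h)."""
    h = []
    for i in range(len(b_h)):
        s = b_h[i]
        s += sum(w * xv for w, xv in zip(W_xh[i], x))       # input contribution
        s += sum(w * hv for w, hv in zip(W_hh[i], h_prev))  # recurrent contribution
        h.append(math.tanh(s))
    return h

def recurrent_params(input_dim, hidden_dim, n_blocks):
    """Parameters in one recurrent layer made of n_blocks affine maps
    from [x_t, h_{t-1}] to hidden_dim (weights plus one bias vector each)."""
    return n_blocks * hidden_dim * (input_dim + hidden_dim + 1)

# Hypothetical sizes and weights, for illustration only.
input_dim, hidden_dim = 3, 2
W_xh = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.1]]
W_hh = [[0.2, 0.0], [-0.3, 0.1]]
b_h = [0.0, 0.0]

# Run the recurrence over a two-frame input sequence.
h = [0.0, 0.0]
for x in ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]):
    h = elman_step(x, h, W_xh, W_hh, b_h)

# An Elman cell uses 1 affine map; a GRU uses 3; an LSTM uses 4.
counts = {
    "Elman": recurrent_params(input_dim, hidden_dim, 1),
    "GRU": recurrent_params(input_dim, hidden_dim, 3),
    "LSTM": recurrent_params(input_dim, hidden_dim, 4),
}
```

Under these assumptions, a GRU layer has three times and an LSTM layer four times the parameters (and roughly the per-step multiply-adds) of an Elman layer with the same input and hidden sizes, which is the kind of overhead the abstract refers to.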

Publisher

Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 93, October 2017, Pages 31-42

Authors

Sivanand Achanta, Suryakanth V Gangashetty
