Article Code: 558405
Journal Code: 874922
Publication Year: 2008
English Paper: 11 pages, PDF (free full-text download)
English Title of the ISI Paper
Acoustic speech unit segmentation for concatenative synthesis
Related Topics
Engineering and Basic Sciences / Computer Engineering / Signal Processing
English Abstract

Synthesis by concatenation of natural speech improves perceptual results when phonemes and syllables are segmented at places where spectral variations are small [Klatt, D., 1987. Review of text-to-speech conversion for English. J. Acoust. Soc. Am. 82 (3), 737–793]. An automatic segmentation method is explored here, using a tool based on a combination of entropy coding, multiresolution analysis, and Kohonen's Self-Organizing Maps. The segmentation method assumes no boundaries imposed by any predefined linguistic unit. The resulting waveforms represent phone chains dominated by spectrally dynamic structures. Each acoustic unit obtained may be composed of a variety of phonemes, or of a segmented part of a phoneme at the unit boundary. The number of units and the unit structure are speaker dependent, i.e., speaking rate and segmental and suprasegmental distinctive features affect them as the dynamic structure varies. Results obtained from two databases of 741 sentences each (one male speaker, one female speaker) show this dependence, presenting a different number of units and occurrences for each speaker. Nevertheless, both speakers show a high occurrence of three-phoneme (36–24%) and four-phoneme (29–27%) sequences. Vowel-consonant-vowel sequences are the most frequent type (9.7–8.3%). Consonant-vowel syllables, which are phonemically frequent in Spanish (58%), are less represented (6.6–3.2%) using this method. The relevance of half-phone segmentation is confirmed: 66% of the total units for the female speaker and 53% for the male speaker start and end with a segmented phone. Perceptual experiments showed that concatenated speech created with dynamic acoustic units was judged more natural than speech built from diphone units.
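
The sketch below is one way to realize the pipeline the abstract outlines: per-frame multiresolution (wavelet) subband analysis, a subband-entropy feature standing in for the entropy-coding stage, and a self-organizing map that clusters frames so cut points can be placed inside spectrally stable stretches, leaving the dynamic transitions inside each unit. This is a minimal illustration under stated assumptions, not the authors' implementation: the frame sizes, the db4 wavelet, the SOM grid, and the use of the PyWavelets and MiniSom libraries are all choices made for the example.

    # Minimal sketch, NOT the authors' implementation: wavelet subband energies
    # plus subband entropy as frame features, a small SOM to cluster frames,
    # and cut points placed at the centres of spectrally stable runs.
    import numpy as np
    import pywt                  # assumed: PyWavelets for multiresolution analysis
    from minisom import MiniSom  # assumed: MiniSom as the self-organizing map

    def frame_signal(x, sr, win_ms=25, hop_ms=10):
        """Slice a mono signal into overlapping frames (25 ms / 10 ms assumed)."""
        win, hop = int(sr * win_ms / 1000), int(sr * hop_ms / 1000)
        n = 1 + (len(x) - win) // hop
        return np.stack([x[i * hop:i * hop + win] for i in range(n)])

    def frame_features(frames, wavelet="db4", level=4):
        """Per frame: log subband energies from a wavelet decomposition, plus
        the Shannon entropy of the normalized subband energy distribution."""
        feats = []
        for f in frames:
            coeffs = pywt.wavedec(f, wavelet, level=level)
            energy = np.array([np.sum(c ** 2) for c in coeffs]) + 1e-12
            p = energy / energy.sum()
            entropy = -np.sum(p * np.log2(p))
            feats.append(np.concatenate([np.log(energy), [entropy]]))
        return np.array(feats)

    def stable_cut_points(feats, grid=(6, 6), min_run=5, iters=5000):
        """Map frames onto SOM nodes; a run of frames winning the same node is
        a spectrally stable stretch whose centre becomes a candidate cut."""
        z = (feats - feats.mean(axis=0)) / (feats.std(axis=0) + 1e-12)
        som = MiniSom(grid[0], grid[1], z.shape[1], sigma=1.0,
                      learning_rate=0.5, random_seed=0)
        som.train_random(z, iters)
        labels = [som.winner(v) for v in z]
        cuts, start = [], 0
        for i in range(1, len(labels) + 1):
            if i == len(labels) or labels[i] != labels[start]:
                if i - start >= min_run:      # long enough to count as stable
                    cuts.append((start + i - 1) // 2)
                start = i
        return cuts                           # frame indices of cut points

Calling stable_cut_points(frame_features(frame_signal(x, sr))) on a sentence yields frame indices between which the waveform can be excised as dynamic acoustic units; a cut index c maps back to sample position c * hop. The min_run threshold controls how conservative the stability criterion is, which in turn governs how many units (and how many half-phone boundaries) result.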

Publisher
Database: Elsevier - ScienceDirect
Journal: Computer Speech & Language - Volume 22, Issue 2, April 2008, Pages 196–206