Comprehensive many-to-many phoneme-to-viseme mapping and its application for concatenative visual speech synthesis

Article ID	Journal ID	Publication year	English article	Full-text version
567128	1452043	2013	20-page PDF	Free download

Related topics

Engineering and Basic Sciences, Computer Engineering, Signal Processing

Abstract

The use of visemes as atomic speech units in visual speech analysis and synthesis systems is well-established. Viseme labels are determined using a many-to-one phoneme-to-viseme mapping. However, due to visual coarticulation effects, an accurate mapping from phonemes to visemes should define a many-to-many mapping scheme instead. In this research it was found that neither the use of standardized nor speaker-dependent many-to-one viseme labels could satisfy the quality requirements of concatenative visual speech synthesis. Therefore, a novel technique to define a many-to-many phoneme-to-viseme mapping scheme is introduced, which makes use of both tree-based and k-means clustering approaches. We show that these many-to-many viseme labels more accurately describe the visual speech information as compared to both phoneme-based and many-to-one viseme-based speech labels. In addition, we found that the use of these many-to-many visemes improves the precision of the segment selection phase in concatenative visual speech synthesis using limited speech databases. Furthermore, the resulting synthetic visual speech was both objectively and subjectively found to be of higher quality when the many-to-many visemes are used to describe the speech database and the synthesis targets.

► The use of many-to-one phoneme-to-viseme mappings for visual speech synthesis was evaluated.
► These many-to-one mappings are unable to accurately describe the visual speech information.
► Novel many-to-many phoneme-to-viseme mapping schemes were constructed.
► These are able to describe the visual speech information more accurately than other approaches.
► They improve the quality of concatenative visual speech synthesis using limited databases.
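
To make the difference between a many-to-one and a many-to-many mapping concrete, the following is a minimal sketch, not the authors' method. It runs a plain k-means on toy data so that each viseme is a cluster of visually similar phoneme instances. The feature names (lip opening, lip rounding), the sample values, the initial centroids, and the function name `kmeans` are all assumptions made for illustration. The paper combines tree-based and k-means clustering on real visual speech data, which this sketch does not attempt to reproduce.

```python
from collections import defaultdict
from math import dist


def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's algorithm with fixed initial centroids; returns (labels, centroids)."""
    centroids = [tuple(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members
        # (an empty cluster keeps its previous centroid).
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(old)
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Hypothetical phoneme instances with made-up visual features:
# (lip opening, lip rounding). The two "t" and two "s" instances stand for
# the same phoneme in a spread and in a rounded context (coarticulation).
samples = [
    ("p", (0.00, 0.20)),
    ("b", (0.05, 0.25)),
    ("m", (0.02, 0.22)),
    ("t", (0.40, 0.10)),
    ("t", (0.35, 0.90)),
    ("s", (0.38, 0.15)),
    ("s", (0.33, 0.85)),
]

points = [features for _, features in samples]
# Fixed starting centroids keep the toy example deterministic.
initial_centroids = [points[0], points[3], points[4]]
labels, _ = kmeans(points, initial_centroids)

# Build both directions of the mapping from the cluster labels.
phoneme_to_visemes = defaultdict(set)
viseme_to_phonemes = defaultdict(set)
for (phoneme, _), viseme in zip(samples, labels):
    phoneme_to_visemes[phoneme].add(viseme)
    viseme_to_phonemes[viseme].add(phoneme)

for phoneme, visemes in phoneme_to_visemes.items():
    print(phoneme, "->", sorted(visemes))
```

In this toy data the bilabials share a single viseme, while "t" and "s" each fall into two visemes depending on context. A many-to-one table cannot express that second case, because it assigns every phoneme exactly one viseme label.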

Publisher

Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 55, Issues 7–8, September 2013, Pages 857–876

Authors

Wesley Mattheyses, Lukas Latacz, Werner Verhelst
