Article code: 565915
Journal code: 1452041
Publication year: 2014
English article: 13-page PDF
Full-text version: free download
English title of the ISI article
Predicting synthetic voice style from facial expressions. An application for augmented conversations
Keywords
Instant speech synthesis; facial expressions; multi-purpose application; augmentative and alternative communication
Related subjects
Engineering and basic sciences > Computer engineering > Signal processing
English abstract


• A new method for processing emotional content in speech generating devices.
• Expressive synthetic speech is generated to match the user’s facial expression.
• Facial expressions are mapped to speech, based on emotional intensity classification.
• A system is developed to evaluate and personalise the expression classification.
• Results indicate that the system may improve communication experience for users.

The ability to efficiently facilitate social interaction and emotional expression is an important, yet unmet requirement for speech generating devices aimed at individuals with speech impairment. Using gestures such as facial expressions to control aspects of expressive synthetic speech could contribute to an improved communication experience for both the user of the device and the conversation partner. For this purpose, a mapping model between facial expressions and speech is needed that is high-level (utterance-based), versatile and personalisable. In the mapping developed in this work, visual and auditory modalities are connected based on the intended emotional salience of a message: the intensity of the user’s facial expressions is mapped to the emotional intensity of the synthetic speech. The mapping model has been implemented in a system called WinkTalk that uses estimated facial expression categories and their intensity values to automatically select between three expressive synthetic voices reflecting three degrees of emotional intensity. An evaluation is conducted through an interactive experiment using simulated augmented conversations. The results show that automatic control of synthetic speech through facial expressions is fast, non-intrusive, sufficiently accurate and supports the user in feeling more involved in the conversation. It can be concluded that the system has the potential to facilitate a more efficient communication process between user and listener.
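The core mechanism the abstract describes, selecting one of three expressive voices from an estimated facial-expression intensity, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the voice names, the 0-to-1 intensity scale, and the threshold values are all assumptions introduced here for clarity.

```python
# Hypothetical sketch of an intensity-to-voice mapping: an estimated
# facial-expression intensity in [0, 1] selects one of three synthetic
# voices of increasing emotional intensity. Thresholds are illustrative.

# Three voices reflecting three degrees of emotional intensity.
VOICES = ("neutral", "moderate", "intense")

def select_voice(intensity, thresholds=(0.33, 0.66)):
    """Map an estimated facial-expression intensity to a voice label."""
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity must be in [0, 1]")
    low, high = thresholds
    if intensity < low:
        return VOICES[0]      # weak expression -> neutral voice
    if intensity < high:
        return VOICES[1]      # medium expression -> moderate voice
    return VOICES[2]          # strong expression -> intense voice

print(select_voice(0.1))  # neutral
print(select_voice(0.5))  # moderate
print(select_voice(0.9))  # intense
```

Because the thresholds are a parameter rather than hard-coded, the same structure could support the per-user personalisation of the expression classification that the highlights mention.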

Publisher
Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 57, February 2014, Pages 63–75
Authors