Article code: 455599
Journal code: 695516
Publication year: 2015
English article: 16-page PDF
Full-text version: Free download
English title of the ISI article
Multiple camera in car audio–visual speech recognition using phonetic and visemic information
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Networks and Communications
Abstract

This paper presents an audio–visual speech recognizer (AVSR) based on phonetic and visemic information. An active appearance model (AAM) is used to extract the visual features, as it finely represents the shape and appearance information extracted from the jaw and lip region. Combining visual features with traditional acoustic features has been found to be promising in complex auditory environments. However, most existing AVSR systems have rarely faced visual-domain problems. In this work, the real-world, multiple-camera audio-visual in-car (AVICAR) corpus is used for the speech recognition experiments. The Texas Instruments and Massachusetts Institute of Technology (TIMIT) sentence portion of the corpus is used to study the performance of the bimodal audio–visual speech recognizer. To account for the McGurk effect, the acoustic and visual models are trained according to phonetic and visemic information, respectively. The phonetic–visemic AVSR system shows a significant improvement over the phonetic AVSR system.


Publisher
Database: Elsevier - ScienceDirect
Journal: Computers & Electrical Engineering - Volume 47, October 2015, Pages 35–50