Multiple camera in car audio–visual speech recognition using phonetic and visemic information

Article code	Journal code	Publication year	English article	Full-text version
455599	695516	2015	16-page PDF	Free download

Keywords

AAM; Speech recognition; Phoneme

Related topics

Engineering and Basic Sciences, Computer Engineering, Computer Networks and Communications

Abstract

This paper presents an audio–visual speech recognizer (AVSR) based on phonetic and visemic information. An active appearance model (AAM) is used to extract the visual features, as it finely represents the shape and appearance information extracted from the jaw and lip region. Using visual features alongside traditional acoustic features has been found to be promising in complex auditory environments. However, most existing AVSR systems have rarely confronted visual-domain problems. In this work, the real-world, multiple-camera audio–visual in-car (AVICAR) corpus is used for the speech recognition experiments. The Texas Instruments and Massachusetts Institute of Technology (TIMIT) sentence portion of the corpus is used to study the performance of the bimodal audio–visual speech recognizer. To account for the McGurk effect, the acoustic and visual models are trained according to phonetic and visemic information, respectively. The phonetic–visemic AVSR system shows significant improvement over the phonetic AVSR system.

Publisher

Database: Elsevier - ScienceDirect
Journal: Computers & Electrical Engineering - Volume 47, October 2015, Pages 35–50

Authors

Astik Biswas, P.K. Sahu, Mahesh Chandra
