Article ID · Journal ID · Publication Year · English Article · Full-Text Version
497031 · 862875 · 2011 · 11-page PDF · Free download
English Title of the ISI Article
Audio–visual speaker identification using dynamic facial movements and utterance phonetic content
Related Subjects
Engineering and Basic Sciences · Computer Engineering · Computer Science Software
English Abstract

Robust multimodal identification systems based on audio–visual information have not yet been thoroughly investigated. The aim of this work is to propose a model-based feature extraction method that employs the physiological characteristics of the facial muscles producing lip movements. The approach adopts intrinsic muscle properties such as viscosity, elasticity, and mass, which are extracted from a dynamic lip model. These parameters depend exclusively on the neuro-muscular properties of the speaker; consequently, imitation of valid speakers can be reduced to a large extent. The parameters are applied to a Hidden Markov Model (HMM) audio–visual identification system. In this work, a combination of audio and video features is employed through a multistream pseudo-synchronized HMM training method. The proposed model is compared to other feature extraction methods, including Kalman filtering, neural networks, the adaptive network fuzzy inference system (ANFIS), and the autoregressive moving average (ARMA) model. The superior performance of the proposed system is demonstrated on a large multispeaker database of continuously spoken digits, along with a phonetically rich sentence. Combining Kalman filtering with the proposed model led to the best performance. The phonetic content of the pronounced sentences is also evaluated to find the phonetic combinations that yield the best identification rate.
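The abstract describes extracting speaker-specific muscle parameters (viscosity, elasticity, mass) from a dynamic lip model. The paper's actual model is not reproduced here, but as an illustration of how such parameters could be recovered from an observed lip-displacement trace, the following is a minimal sketch that assumes a second-order mass-spring-damper form, m·x'' + b·x' + k·x = F(t); all function names, values, and the forcing signal are illustrative assumptions, not the authors' method:

```python
import numpy as np

def simulate_lip(m, b, k, force, dt, n):
    """Semi-implicit Euler integration of m*x'' + b*x' + k*x = force(t),
    starting from rest; returns the sampled displacement trace.
    (Illustrative stand-in for an observed lip-movement signal.)"""
    x, v = 0.0, 0.0
    xs = np.empty(n)
    for i in range(n):
        a = (force(i * dt) - b * v - k * x) / m
        v += a * dt
        x += v * dt
        xs[i] = x
    return xs

def estimate_muscle_params(xs, force, dt):
    """Recover the viscosity and elasticity ratios (b/m, k/m) from a
    displacement trace by least squares.

    Finite differences give velocity and acceleration; the assumed model
    a = force/m - (b/m)*v - (k/m)*x is then linear in the unknowns."""
    t = np.arange(len(xs)) * dt
    v = np.gradient(xs, dt)
    a = np.gradient(v, dt)
    f = force(t)
    # Trim the edges, where one-sided differences are less accurate.
    s = slice(2, -2)
    A = np.column_stack([f[s], -v[s], -xs[s]])
    inv_m, b_over_m, k_over_m = np.linalg.lstsq(A, a[s], rcond=None)[0]
    return b_over_m, k_over_m
```

A multi-frequency forcing keeps the regression well conditioned (a single sinusoid would make displacement, velocity, and force nearly collinear in steady state). In the paper these per-speaker parameters then feed the HMM-based identification stage.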

Publisher
Database: Elsevier - ScienceDirect
Journal: Applied Soft Computing - Volume 11, Issue 2, March 2011, Pages 2083–2093
Authors