Article ID · Journal ID · Publication Year · English Article · Full-Text Version
497031 · 862875 · 2011 · 11-page PDF · Free download
English Title of the ISI Article
Audio–visual speaker identification using dynamic facial movements and utterance phonetic content
Related Subjects
Engineering and Basic Sciences · Computer Engineering · Computer Science Software
English Abstract

Robust multimodal identification systems based on audio–visual information have not yet been thoroughly investigated. The aim of this work is to propose a model-based feature extraction method that employs the physiological characteristics of the facial muscles producing lip movements. The approach adopts intrinsic muscle properties such as viscosity, elasticity, and mass, which are extracted from a dynamic lip model. These parameters depend exclusively on the neuro-muscular properties of the speaker; consequently, imitation of valid speakers can be reduced to a large extent. The parameters are applied to a Hidden Markov Model (HMM) audio–visual identification system. In this work, a combination of audio and video features is employed through a multistream pseudo-synchronized HMM training method. The proposed model is compared to other feature extraction methods, including Kalman filtering, neural networks, the adaptive network fuzzy inference system (ANFIS), and the autoregressive moving average (ARMA) model. The superior performance of the proposed system is demonstrated on a large multispeaker database of continuously spoken digits, along with a phonetically rich sentence. Combining Kalman filtering with the proposed model led to the best performance. The phonetic content of the pronounced sentences is also evaluated to find the phonetic combinations that yield the best identification rate.
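The abstract describes extracting speaker-specific muscle parameters (viscosity, elasticity, mass) from a dynamic lip model. The paper's actual model is not reproduced here, but as an illustration of how such parameters could be recovered from an observed lip-displacement trace, the following is a minimal sketch that assumes a second-order mass-spring-damper form, m·x'' + b·x' + k·x = F(t); all function names, values, and the forcing signal are illustrative assumptions, not the authors' method:

```python
import numpy as np

def simulate_lip(m, b, k, force, dt, n):
    """Semi-implicit Euler integration of m*x'' + b*x' + k*x = force(t),
    starting from rest; returns the sampled displacement trace.
    (Illustrative stand-in for an observed lip-movement signal.)"""
    x, v = 0.0, 0.0
    xs = np.empty(n)
    for i in range(n):
        a = (force(i * dt) - b * v - k * x) / m
        v += a * dt
        x += v * dt
        xs[i] = x
    return xs

def estimate_muscle_params(xs, force, dt):
    """Recover the viscosity and elasticity ratios (b/m, k/m) from a
    displacement trace by least squares.

    Finite differences give velocity and acceleration; the assumed model
    a = force/m - (b/m)*v - (k/m)*x is then linear in the unknowns."""
    t = np.arange(len(xs)) * dt
    v = np.gradient(xs, dt)
    a = np.gradient(v, dt)
    f = force(t)
    # Trim the edges, where one-sided differences are less accurate.
    s = slice(2, -2)
    A = np.column_stack([f[s], -v[s], -xs[s]])
    inv_m, b_over_m, k_over_m = np.linalg.lstsq(A, a[s], rcond=None)[0]
    return b_over_m, k_over_m
```

A multi-frequency forcing keeps the regression well conditioned (a single sinusoid would make displacement, velocity, and force nearly collinear in steady state). In the paper these per-speaker parameters then feed the HMM-based identification stage.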

Publisher
Database: Elsevier - ScienceDirect
Journal: Applied Soft Computing - Volume 11, Issue 2, March 2011, Pages 2083–2093
Authors