Article code | Journal code | Year | Article | Full text |
---|---|---|---|---|
534697 | 870280 | 2011 | 7-page English PDF | Free download |
This paper presents a novel lip-synchronization technique that investigates the correlation between speech and lip movements. First, the speech signal is represented by a nonlinear time-varying model consisting of a sum of AM–FM signals, each of which models a single formant frequency. The model is realized using a Taylor series expansion that relates the lip shape (width and height) to the speech amplitude and instantaneous frequency. From the lip width and height, a semi-speech signal is generated and correlated with the original speech signal over a range of delays, from which the delay between the audio and the video is estimated. On real and noisy data from the VidTimit and in-house datasets, the proposed method was able to estimate small delays of 0.01–0.1 s for noise-free and noisy signals, respectively, with a maximum absolute error of 0.0022 s.
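The delay-estimation step described in the abstract, correlating a lip-derived semi-speech signal with the audio over a range of lags and picking the peak, can be sketched with a standard cross-correlation. This is a minimal illustration only: the function name, the sampling rate, and the synthetic white-noise signals are assumptions for the demo, not the paper's actual feature pipeline.

```python
import numpy as np

def estimate_delay(speech, semi_speech, fs):
    """Estimate the lag (in seconds) of `speech` relative to `semi_speech`
    by locating the peak of their full cross-correlation."""
    # Remove the means so the correlation is not dominated by DC offsets.
    corr = np.correlate(speech - speech.mean(),
                        semi_speech - semi_speech.mean(), mode="full")
    # For equal-length inputs of length N, index (N - 1) corresponds to zero lag.
    lag_samples = np.argmax(corr) - (len(semi_speech) - 1)
    return lag_samples / fs

# Demo with a synthetic signal delayed by a known amount (hypothetical values).
fs = 1000                      # assumed sampling rate, Hz
rng = np.random.default_rng(0)
x = rng.standard_normal(2000)  # stand-in for the semi-speech signal
d = 37                         # true delay in samples (0.037 s)
semi = x
speech = np.concatenate([np.zeros(d), x])[:len(x)]  # speech lags semi by d

print(estimate_delay(speech, semi, fs))  # recovers 0.037 s
```

Because white noise has a sharp autocorrelation peak, the recovered lag here is exact; for real speech and video features the peak is broader, which is why the reported error (up to 0.0022 s) is nonzero.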
Research highlights
► A novel approach is proposed for the task of lip synchronization.
► The experimental results are consistent with the considered model and the derived analysis.
► The method is able to correct for lip synchronization problems with minimal errors.
► In non-frontal faces, the method was able to achieve acceptable results.
► The method does not depend on any audio or video pilot signals.
► The method is blind to the type of the spoken language.
Journal: Pattern Recognition Letters - Volume 32, Issue 6, 15 April 2011, Pages 780–786