Voice activity detection based on adjustable linear prediction and GARCH models

Article code	Journal code	Publication year	English article	Full-text version
565490	875764	2008	11-page PDF	Free download

Keywords

Voice activity detection; Kalman filter; State-space representation; Linear prediction

Related topics

Engineering and Basic Sciences, Computer Engineering, Signal Processing

Abstract

We propose a method for voice activity detection (VAD) that employs a class of the Autoregressive–Generalized Autoregressive Conditional Heteroskedasticity (AR-GARCH) model. To handle correlated speech signals, we represent the AR part of the AR-GARCH model in state-space form to obtain an appropriate linear prediction error series. By applying the GARCH model to this residual, we estimate the conditional variance sequences corresponding to the voice activity parts. To detect voice activity, we establish an appropriate threshold for the conditional variance sequences. To confirm the performance of the proposed VAD method, we conduct experiments using speech signals with real airport and street background noise at signal-to-noise ratios (SNRs) of 10, 5, and 0 dB. Furthermore, using receiver operating characteristic curves and equal error rates, we compare our results with those of previous standardized VAD algorithms (ITU-T G.729B, ETSI ES 202 050, and ETSI EN 301 708) as well as recently developed methods (VAD with long-term spectral divergence, likelihood ratio tests, and higher-order statistics for VAD). For signals with background noise at an SNR of 0 dB, the experimental results show a significant performance improvement over the standardized VAD algorithms and more than 10% improvement over the recently developed VAD methods.
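
The pipeline described in the abstract (linear prediction residual, conditional variance, threshold) can be illustrated with a minimal Python sketch. It is not the authors' implementation: it swaps the state-space AR estimation for a plain Levinson-Durbin fit, uses a GARCH(1,1) recursion with hand-picked rather than estimated parameters, and applies a fixed threshold. The function names, the synthetic signal, and all numeric values (`omega`, `alpha`, `beta`, `threshold`) are assumptions made for illustration only.

```python
import random


def levinson_durbin(signal, order):
    """Estimate AR coefficients a[k] so that x[t] ~ sum_k a[k] * x[t-1-k]."""
    n = len(signal)
    # Biased autocorrelation estimates r[0..order].
    r = [sum(signal[t] * signal[t - lag] for t in range(lag, n)) / n
         for lag in range(order + 1)]
    coeffs, error = [], r[0]
    for m in range(order):
        if error <= 0.0:
            # Nothing left to predict: pad with zeros and stop.
            coeffs += [0.0] * (order - m)
            break
        acc = r[m + 1] - sum(coeffs[k] * r[m - k] for k in range(m))
        reflection = acc / error
        coeffs = [coeffs[k] - reflection * coeffs[m - 1 - k]
                  for k in range(m)] + [reflection]
        error *= 1.0 - reflection ** 2
    return coeffs


def prediction_residual(signal, coeffs):
    """Linear prediction error e[t] = x[t] - sum_k a[k] * x[t-1-k]."""
    residual = []
    for t in range(len(signal)):
        predicted = sum(a * signal[t - 1 - k]
                        for k, a in enumerate(coeffs) if t - 1 - k >= 0)
        residual.append(signal[t] - predicted)
    return residual


def garch_variance(residual, omega, alpha, beta):
    """GARCH(1,1) recursion: h[t] = omega + alpha * e[t-1]^2 + beta * h[t-1]."""
    # Assumption: start from the sample variance of the residual.
    h = [sum(e * e for e in residual) / len(residual)]
    for t in range(1, len(residual)):
        h.append(omega + alpha * residual[t - 1] ** 2 + beta * h[-1])
    return h


def detect_voice_activity(signal, order, omega, alpha, beta, threshold):
    """Flag samples whose conditional variance exceeds a fixed threshold."""
    coeffs = levinson_durbin(signal, order)
    residual = prediction_residual(signal, coeffs)
    variance = garch_variance(residual, omega, alpha, beta)
    return [h > threshold for h in variance]


def make_demo_signal(seed=0):
    """Hypothetical test signal: noise, an AR(2) 'voiced' burst, then noise."""
    rng = random.Random(seed)
    signal, x1, x2 = [], 0.0, 0.0
    for t in range(1200):
        noise = rng.gauss(0.0, 0.1)
        if 400 <= t < 800:
            x = 1.3 * x1 - 0.6 * x2 + rng.gauss(0.0, 1.0)
            x1, x2 = x, x1
            signal.append(x + noise)
        else:
            x1 = x2 = 0.0
            signal.append(noise)
    return signal


# Illustrative parameter values, chosen by hand for the synthetic signal.
signal = make_demo_signal(seed=0)
flags = detect_voice_activity(signal, order=2, omega=1e-4, alpha=0.1,
                              beta=0.85, threshold=0.1)
print("fraction flagged inside the burst:", sum(flags[400:800]) / 400)
```

On real speech the GARCH parameters and the threshold would have to be estimated from data rather than fixed by hand, and the decision would normally be made per frame rather than per sample.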

Publisher

Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 50, Issue 6, June 2008, Pages 476–486

Authors

Hiroko Kato Solvang, Kentaro Ishizuka, Masakiyo Fujimoto
