Monaural speech segregation based on fusion of source-driven with model-driven techniques

کد مقاله	کد نشریه	سال انتشار	مقاله انگلیسی	نسخه تمام متن
568930	876489	2007	13 صفحه PDF	دانلود رایگان

عنوان انگلیسی مقاله ISI

دانلود مقاله + سفارش ترجمه

دانلود مقاله ISI انگلیسی

رایگان برای ایرانیان

کلمات کلیدی

Vector Quantization - کوانتیزاسیون برداری Speech coding - برنامه نویسی گفتاری CASA - خانه Speech processing - پردازش گفتار

موضوعات مرتبط

مهندسی و علوم پایه مهندسی کامپیوتر پردازش سیگنال

پیش نمایش صفحه اول مقاله

Monaural speech segregation based on fusion of source-driven with model-driven techniques

چکیده انگلیسی

In this paper by exploiting the prevalent methods in speech coding and synthesis, a new single channel speech segregation technique is presented. The technique integrates a model-driven method with a source-driven method to take advantage of both individual approaches and reduce their pitfalls significantly. We apply harmonic modelling in which the pitch and spectrum envelope are the main components for the analysis and synthesis stages. Pitch values of two speakers are obtained by using a source-driven method. The spectrum envelope, is obtained by using a new model-driven technique consisting of four components: a trained codebook of the vector quantized envelopes (VQ-based separation), a mixture-maximum approximation (MIXMAX), minimum mean square error estimator (MMSE), and a harmonic synthesizer. In contrast with previous model-driven techniques, this approach is speaker independent and can separate out the unvoiced regions as well as suppress the crosstalk effect which both are the drawbacks of source-driven or equivalently computational auditory scene analysis (CASA) models. We compare our fused model with both model- and source-driven techniques by conducting subjective and objective experiments. The results show that although for the speaker-dependent case, model-based separation delivers the best quality, for a speaker independent scenario the integrated model outperforms the individual approaches. This result supports the idea that the human auditory system takes on both grouping cues (e.g., pitch tracking) and a priori knowledge (e.g., trained quantized envelopes) to segregate speech signals.

ناشر

Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Speech Communication - Volume 49, Issue 6, June 2007, Pages 464–476

نویسندگان

Mohammad H. Radfar, Richard M. Dansereau, Abolghasem Sayadiyan,

علوم انسانی و هنر

فنی، مهندسی و علوم پایه

پزشکی و سلامت

بیو تکنولوژی

پذیرش سفارش ترجمه

Monaural speech segregation based on fusion of source-driven with model-driven techniques

دسترسی سریع

ارتباط

English Website