Separation of multiple speech sources by recovering sparse and non-sparse components from B-format microphone recordings

Article code	Journal code	Publication year	English article	Full-text version
6960839	1452004	2018	15-page PDF	Free download

Keywords

Sparsity

Related subjects

Engineering and Basic Sciences; Computer Engineering; Signal Processing

Abstract

This paper proposes a blind source separation (BSS) method for recovering multiple speech sources from sound fields recorded by a B-format microphone. This microphone provides a four-channel representation that can be used to derive the direction of arrival (DOA) of spatially distinct time-frequency (TF) components. Such sparse components correspond to TF bins where only one speech source is active; they are identified from the inter-correlation among the mixture signals and recovered via a degenerate unmixing estimation technique (DUET)-like method. The paper also proposes a “local-zone stationarity” assumption, under which the amplitude of a speech signal remains approximately constant within a small band of TF components. This assumption is validated through statistical analysis of a quantitative measure of stationarity. Under this assumption, the non-sparse components (TF points where more than one speech source is active) are recovered via a Wiener-filter-like approach in which the separated sparse components are used as a guide. The final separated sources are obtained by combining the separated sparse and non-sparse components. Both objective and subjective evaluations, with up to six simultaneous speech sources, show that the proposed method achieves better separation quality than some existing BSS approaches.
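
To make the sparse-component step more concrete, the following is a minimal sketch, not the authors' implementation. It assumes horizontal plane-wave sources, the traditional B-format convention in which W is scaled by 1/sqrt(2), a common intensity-vector azimuth estimate, and source azimuths that are already known (a DUET-style method would instead find them by clustering the per-bin estimates). All function names are hypothetical, and the TF bins are synthetic so that each bin contains exactly one active source.

```python
import numpy as np

def encode_bformat(s, azimuth, elevation=0.0):
    """Encode a plane-wave source spectrum into B-format (W, X, Y, Z).

    Assumes the traditional convention where W carries a 1/sqrt(2) gain.
    """
    w = s / np.sqrt(2.0)
    x = s * np.cos(azimuth) * np.cos(elevation)
    y = s * np.sin(azimuth) * np.cos(elevation)
    z = s * np.sin(elevation)
    return w, x, y, z

def estimate_azimuth(w, x, y):
    """Per-bin azimuth from the intensity-like vector Re(conj(W) * [X, Y])."""
    ix = np.real(np.conj(w) * x)
    iy = np.real(np.conj(w) * y)
    return np.arctan2(iy, ix)

def assign_bins(est_az, source_az):
    """Label each bin with the index of the angularly closest source."""
    diff = est_az[..., None] - np.asarray(source_az)
    dist = np.abs(np.angle(np.exp(1j * diff)))  # wrap differences to [-pi, pi]
    return np.argmin(dist, axis=-1)

# Synthetic example: two sources, each TF bin owned by exactly one of them.
rng = np.random.default_rng(0)
n_bins = 64
source_az = np.deg2rad([30.0, -100.0])          # assumed known here
s = rng.standard_normal((2, n_bins)) + 1j * rng.standard_normal((2, n_bins))
active = rng.integers(0, 2, size=n_bins)        # index of the active source per bin
s[0, active == 1] = 0
s[1, active == 0] = 0

# Mix the two encoded sources channel by channel.
w = np.zeros(n_bins, dtype=complex)
x = np.zeros(n_bins, dtype=complex)
y = np.zeros(n_bins, dtype=complex)
for k in range(2):
    wk, xk, yk, _ = encode_bformat(s[k], source_az[k])
    w, x, y = w + wk, x + xk, y + yk

# Estimate a DOA per bin, then separate with a binary mask on the W channel.
labels = assign_bins(estimate_azimuth(w, x, y), source_az)
recovered = np.stack([np.where(labels == k, w * np.sqrt(2.0), 0) for k in range(2)])
```

Binary masking of this kind only works for bins dominated by a single source, which is why the abstract treats the non-sparse bins separately with a Wiener-filter-like step.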

Publisher

Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 96, February 2018, Pages 184-196

Authors

Maoshen Jia, Jundai Sun, Changchun Bao, Christian Ritz
