Article code: 6960839
Journal code: 1452004
Publication year: 2018
English article: 15 pages (PDF)
Full-text version: free download
English title of the ISI article
Separation of multiple speech sources by recovering sparse and non-sparse components from B-format microphone recordings
Related topics
Engineering and basic sciences, Computer engineering, Signal processing
English abstract
This paper proposes a blind source separation (BSS) method for recovering multiple speech sources from sound fields recorded by a B-format microphone. This microphone provides a four-channel representation that can be used to derive the direction of arrival (DOA) of spatially distinct time-frequency (TF) components. Such sparse components correspond to TF bins where only one speech source is active; they are identified based on the inter-correlation among the mixture signals and recovered via a degenerate unmixing estimation technique (DUET)-like method. A “local-zone stationarity” assumption is proposed, under which the amplitude of a speech signal remains approximately constant within a small band of TF components. This assumption is validated through statistical analysis of a quantitative measure of stationarity. Under this assumption, the non-sparse components (TF points where more than one speech source is active) are recovered via a Wiener-filter-like approach in which the separated sparse components are used as a guide. The final separated sources are obtained by combining the separated sparse and non-sparse components. Both objective and subjective evaluations, with up to six simultaneous speech sources, show that the proposed method achieves better separation quality than some existing BSS approaches.
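To make the sparse-component step of the abstract concrete, the following is a minimal sketch and not the paper's implementation. It assumes horizontal-only sources, the traditional B-format encoding gains (W scaled by 1/sqrt(2)), a per-bin DOA taken from the active intensity vector, and known candidate source directions. The function names `bformat_encode`, `estimate_azimuth` and `duet_like_masks` are hypothetical, and the toy signals are random numbers in which exactly one source is active in every TF bin.

```python
import numpy as np


def bformat_encode(s, azimuth):
    """Encode a horizontal plane-wave TF signal into W, X, Y channels.

    Assumption: traditional B-format gains, with W attenuated by 1/sqrt(2).
    """
    w = s / np.sqrt(2.0)
    x = s * np.cos(azimuth)
    y = s * np.sin(azimuth)
    return w, x, y


def estimate_azimuth(w, x, y):
    """Per-bin azimuth from the active intensity vector Re{conj(W) * [X, Y]}."""
    ix = np.real(np.conj(w) * x)
    iy = np.real(np.conj(w) * y)
    return np.arctan2(iy, ix)


def duet_like_masks(azimuth_map, source_azimuths):
    """Binary masks that assign each TF bin to the closest candidate direction."""
    centers = np.asarray(source_azimuths, dtype=float)
    centers = centers.reshape((-1,) + (1,) * azimuth_map.ndim)
    # Wrap angular differences into (-pi, pi] before comparing distances.
    dist = np.abs(np.angle(np.exp(1j * (azimuth_map[None, ...] - centers))))
    nearest = np.argmin(dist, axis=0)
    return [nearest == k for k in range(centers.shape[0])]


# Toy example: two sources with disjoint TF support (every bin is "sparse").
rng = np.random.default_rng(0)
shape = (8, 16)  # (frequency bins, time frames), arbitrary toy size
active1 = rng.random(shape) < 0.5
s1 = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) * active1
s2 = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) * ~active1
theta1, theta2 = 0.5, -2.0  # assumed source azimuths in radians

w1, x1, y1 = bformat_encode(s1, theta1)
w2, x2, y2 = bformat_encode(s2, theta2)
w_mix, x_mix, y_mix = w1 + w2, x1 + x2, y1 + y2

est = estimate_azimuth(w_mix, x_mix, y_mix)
masks = duet_like_masks(est, [theta1, theta2])
s1_hat = masks[0] * w_mix * np.sqrt(2.0)  # undo the W gain on source-1 bins
```

In this idealized setting each bin's estimated azimuth equals the azimuth of its single active source, so the masks recover the sources exactly. Bins where several sources overlap would violate that, which is the case the abstract handles separately with the Wiener-filter-like step.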
Publisher
Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 96, February 2018, Pages 184-196