Article code: 566006
Journal code: 1452024
Publication year: 2016
English article: 13-page PDF
Full-text version: free download
English title of the ISI article
Binaural rendering of microphone array captures based on source separation
Persian translation of the title
ارائه دوزبانه آرایه میکروفن قطاری بر اساس جدایی منبع است
Keywords
Related topics
Engineering and Basic Sciences, Computer Engineering, Signal Processing
English abstract


• A method for binaural rendering of sound scene recordings is proposed.
• Source signals and their directions of arrival are estimated using a microphone array.
• A low-rank NMF model for separation of sound sources is used.
• A speech intelligibility test with overlapping speech is conducted.
• Binaural processing is shown to increase speech intelligibility over stereo.
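
As a rough illustration of the low-rank idea behind the separation step, the sketch below factors a small non-negative matrix V (standing in for a magnitude spectrogram, frequency by time) into W times H using the standard multiplicative updates for the Euclidean cost. This is an assumption-laden simplification: it is plain real-valued NMF, not the complex-valued NMF with a spatial covariance model that the paper uses, and the function names (`nmf`, `matmul`, `frobenius_error`) and the toy matrix are hypothetical. It sticks to the Python standard library so it runs without extra packages.

```python
import random


def matmul(A, B):
    """Matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    """Transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*A)]


def frobenius_error(V, W, H):
    """Squared Frobenius norm of V - W @ H."""
    WH = matmul(W, H)
    return sum((v - x) ** 2 for rv, rx in zip(V, WH) for v, x in zip(rv, rx))


def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor non-negative V (freq x time) as W (freq x rank) @ H (rank x time).

    Uses multiplicative updates for the Euclidean cost. Plain NMF only;
    this is NOT the complex-valued model described in the paper.
    """
    rng = random.Random(seed)
    n_freq, n_time = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(n_freq)]
    H = [[rng.random() + eps for _ in range(n_time)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[h * n / (d + eps) for h, n, d in zip(rh, rn, rd)]
             for rh, rn, rd in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[w * n / (d + eps) for w, n, d in zip(rw, rn, rd)]
             for rw, rn, rd in zip(W, num, den)]
    return W, H


# Toy rank-2 "spectrogram": rows 1 and 3 share one activation pattern.
V_toy = [[1.0, 0.0, 2.0, 0.0],
         [0.0, 3.0, 0.0, 1.0],
         [2.0, 0.0, 4.0, 0.0]]
W_toy, H_toy = nmf(V_toy, rank=2)
```

Each column of W acts as a spectral template and each row of H as its activation over time; keeping the rank small is what makes the representation "low-rank".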

This paper proposes a method for binaural reconstruction of a sound scene captured with a portable-sized array consisting of several microphones. The proposed processing separates the scene into a sum of a small number of sources, and the spectrogram of each source is in turn represented by a small number of latent components. The direction of arrival (DOA) of each source is estimated, and each source is then binaurally rendered at its estimated direction. To represent the sources, the proposed method uses low-rank complex-valued non-negative matrix factorization (NMF) combined with a DOA-based spatial covariance matrix model. The binaural reconstruction is achieved by applying the binaural cues (head-related transfer function, HRTF) associated with the estimated source DOA to the separated source signals. The binaural rendering quality of the proposed method was evaluated using a speech intelligibility test. The test results indicated that, in a three-speaker scenario, the proposed binaural rendering improved the intelligibility of speech over both stereo recordings and separation by a minimum variance distortionless response (MVDR) beamformer with the same binaural synthesis. An additional listening test evaluating the subjective quality of the rendered output indicated that the proposed method adds no processing artifacts in comparison to an unprocessed stereo recording.
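
The final rendering step described in the abstract can be sketched as follows, assuming the sources are already separated and each has an estimated DOA. Each source signal is convolved with a left-ear and a right-ear impulse response for its direction, and the results are summed per ear. Everything here is hypothetical: the function names, the DOA convention (degrees, negative meaning the listener's left), and the three-tap impulse responses, which only mimic an interaural delay and level difference and are not measured HRTFs.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, s in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += s * h
    return out


def render_binaural(sources, hrirs):
    """Mix separated sources into a (left, right) pair of sample lists.

    sources: list of (signal, doa) pairs, one per separated source.
    hrirs:   dict mapping a DOA to a (left_ir, right_ir) pair.
    """
    length = max(
        len(sig) + max(len(hrirs[doa][0]), len(hrirs[doa][1])) - 1
        for sig, doa in sources
    )
    left = [0.0] * length
    right = [0.0] * length
    for sig, doa in sources:
        ir_left, ir_right = hrirs[doa]
        for n, x in enumerate(convolve(sig, ir_left)):
            left[n] += x
        for n, x in enumerate(convolve(sig, ir_right)):
            right[n] += x
    return left, right


# Toy impulse responses (assumed, not measured): the near ear hears the
# source directly, the far ear hears it two samples later at half level.
toy_hrirs = {
    -90: ([1.0, 0.0, 0.0], [0.0, 0.0, 0.5]),  # source on the left
    90: ([0.0, 0.0, 0.5], [1.0, 0.0, 0.0]),   # source on the right
}
toy_sources = [([1.0, 2.0], -90), ([3.0], 90)]
left_out, right_out = render_binaural(toy_sources, toy_hrirs)
```

In practice the impulse responses would come from a measured HRTF set indexed by direction, and the convolution would typically be done in the frequency domain for efficiency.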

Publisher
Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 76, February 2016, Pages 157–169