Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
6959260 | 1451954 | 2015 | 14-page PDF | Free download |
English title of the ISI article
Automatic adaptive speech separation using beamformer-output-ratio for voice activity classification
Related topics
Engineering and Basic Sciences
Computer Engineering
Signal Processing
Abstract
This paper focuses on the practical challenge of adaptation control for speech separation systems. Adaptive beamforming methods, such as minimum variance distortionless response (MVDR), can effectively extract the desired speech signal from interference and noise. However, to avoid the signal cancellation problem, the beamformer adaptation is halted when the desired speaker is active. An automated scheme for this adaptation requires classifying the speakers' voice activity status, which remains a challenge in multi-speaker environments. In this paper, we propose a novel approach to identify voice activities for two speakers based on a new metric, called the beamformer-output-ratio (BOR). Statistical properties of the BOR are studied and used to develop a hypothesis-based method for voice activity classification. The method is further refined using an algorithm that detects incorrect beamformer adaptation by analysing changes in the output power of a blindly adapting MVDR beamformer. Based on the new methods, we construct an automatic adaptive beamforming system to simultaneously separate speech for two speakers. The speech separation module of the system uses MVDR beamformers whose adaptation is guided by the voice activity classification. In some cases, our methods can lead to a 20% reduction in voice activity classification error and an 8 dB improvement in the output SINR. The results are verified on both synthesised signals and realistic recordings.
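As a rough illustration of the adaptation control the abstract describes, the following minimal sketch computes standard MVDR weights, w = R⁻¹d / (dᴴR⁻¹d), for a two-microphone array and freezes the covariance update while the desired speaker is active. It is not the authors' implementation: the function names, the smoothing factor `alpha`, and the boolean `desired_active` flag are assumptions made for this example, and the paper's BOR-based classifier that would supply that flag is not reproduced here.

```python
# Sketch only: textbook MVDR weights for two microphones, with adaptation
# gated by a voice-activity flag. All names here are hypothetical.

def mvdr_weights_2mic(R, d):
    """Return w = R^{-1} d / (d^H R^{-1} d).

    R: 2x2 Hermitian noise-plus-interference covariance (nested lists).
    d: steering vector of the desired speaker (length 2).
    """
    (a, b), (c, e) = R
    det = a * e - b * c
    if abs(det) < 1e-12:
        raise ValueError("covariance matrix is singular; add diagonal loading")
    # R^{-1} d using the explicit 2x2 inverse
    u0 = (e * d[0] - b * d[1]) / det
    u1 = (-c * d[0] + a * d[1]) / det
    # d^H R^{-1} d normalises the response towards the desired speaker to 1
    denom = d[0].conjugate() * u0 + d[1].conjugate() * u1
    return [u0 / denom, u1 / denom]


def update_covariance(R, x, desired_active, alpha=0.95):
    """Recursively average the covariance with snapshot x.

    The update is skipped while the desired speaker is active, so the
    beamformer does not adapt to (and cancel) the desired signal.
    `desired_active` is assumed to come from an external classifier.
    """
    if desired_active:
        return R
    return [
        [alpha * R[i][j] + (1 - alpha) * x[i] * x[j].conjugate()
         for j in range(2)]
        for i in range(2)
    ]


if __name__ == "__main__":
    R = [[1.0, 0.0], [0.0, 1.0]]
    d = [1.0, 1j]
    w = mvdr_weights_2mic(R, d)
    # Distortionless constraint: w^H d should equal 1
    print(sum(wi.conjugate() * di for wi, di in zip(w, d)))
```

The gating in `update_covariance` reflects only the general idea stated in the abstract; how the activity decision is made, and how misclassifications are detected, is the subject of the paper itself.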
Publisher
Database: Elsevier - ScienceDirect
Journal: Signal Processing - Volume 113, August 2015, Pages 259-272
Authors
Thuy Ngoc Tran, William Cowley, André Pollok