Article code: 565282
Journal code: 1452022
Publication year: 2016
English article: 16-page PDF
Full text: free download
English title of the ISI article
Compositional model for speech denoising based on source/filter speech representation and smoothness/sparseness noise constraints
Keywords
Audio source separation; Speech separation; Speech enhancement; Non-negative matrix factorization; Compositional models
Related topics
Engineering and Basic Sciences > Computer Engineering > Signal Processing
English abstract


• Semi-supervised NMF with source/filter speech model and constrained noise parameters.
• Evaluated on the 3rd CHiME and SiSEC 2013 datasets for speech and noise separation.
• Proposed noise constraints help to improve the isolation of speech with real noises.
• All tested environments but one could be modeled by the constraints.
• Better separation results than conventional semi-supervised sparse NMF.

We present a speech denoising algorithm based on a regularized non-negative matrix factorization (NMF), in which several constraints are defined to describe the background noise in a generic way. The observed spectrogram is decomposed into four signal contributions: the voiced speech source and three generic types of noise. The speech signal is represented by a source/filter model which captures only voiced speech, and where the filter bases are trained on a database of individual phonemes, resulting in a small dictionary of phoneme envelopes. The three remaining terms represent the background noise as a sum of three different types of noise (smooth noise, impulsive noise and pitched noise), where each type of noise is characterized individually by imposing specific spectro-temporal constraints, based on sparseness and smoothness restrictions. The method was evaluated on the 3rd CHiME Speech Separation and Recognition Challenge development dataset and compared with conventional semi-supervised NMF with sparse activations. Our experiments show that, with a similar number of bases, source/filter modeling of speech in conjunction with the proposed noise constraints produces better separation results than sparse training of speech bases, even though the system is only designed for voiced speech and the results may still not be practical for many applications.
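To make the comparison baseline concrete, the following is a minimal sketch of conventional semi-supervised NMF with sparse activations, the method the abstract compares against. It is not the authors' source/filter model and does not implement their smoothness/sparseness noise constraints. The function name `semi_supervised_sparse_nmf`, the parameters `n_noise` and `sparsity`, the choice of the Kullback-Leibler divergence, and the soft-mask reconstruction are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def semi_supervised_sparse_nmf(V, W_speech, n_noise=8, sparsity=0.1,
                               n_iter=200, eps=1e-12, seed=0):
    """Baseline sketch (hypothetical helper, not the paper's algorithm).

    V        : (F, T) non-negative magnitude spectrogram of the noisy mixture
    W_speech : (F, Ks) speech bases trained beforehand and kept fixed
    Returns (speech_estimate, noise_estimate), both of shape (F, T).
    """
    rng = np.random.default_rng(seed)
    F, T = V.shape
    Ks = W_speech.shape[1]
    W_noise = rng.random((F, n_noise)) + eps   # noise bases learned from the mixture
    H = rng.random((Ks + n_noise, T)) + eps    # activations for speech and noise
    ones = np.ones_like(V)

    for _ in range(n_iter):
        W = np.hstack([W_speech, W_noise])

        # Multiplicative KL update of all activations; the L1 penalty
        # appears as the constant `sparsity` in the denominator.
        V_hat = W @ H + eps
        H *= (W.T @ (V / V_hat)) / (W.T @ ones + sparsity + eps)

        # Update only the noise bases; the speech bases stay fixed.
        V_hat = W @ H + eps
        H_n = H[Ks:]
        W_noise *= ((V / V_hat) @ H_n.T) / (ones @ H_n.T + eps)

        # Normalise noise bases to unit column sum and rescale their
        # activations so the product W_noise @ H_n is unchanged.
        scale = W_noise.sum(axis=0, keepdims=True) + eps
        W_noise /= scale
        H[Ks:] *= scale.T

    # Soft-mask reconstruction (assumed): split V in proportion to the
    # speech share of the model.
    W = np.hstack([W_speech, W_noise])
    V_hat = W @ H + eps
    mask = (W_speech @ H[:Ks]) / V_hat
    return mask * V, (1.0 - mask) * V
```

In this sketch the two returned estimates always sum to the input spectrogram, because both come from the same mask. The approach described in the abstract would replace the fixed speech dictionary with a source/filter model and split the single free noise term into three constrained terms (smooth, impulsive and pitched noise).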

Publisher
Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 78, April 2016, Pages 84–99