Article code: 567776
Journal code: 876155
Publication year: 2006
English article: 19 pages, PDF
Full-text version: free download
English title of the ISI article
Clean speech reconstruction from MFCC vectors and fundamental frequency using an integrated front-end
Related topics
Engineering and Basic Sciences, Computer Engineering, Signal Processing
English abstract

The aim of this work is to enable a noise-free time-domain speech signal to be reconstructed from a stream of MFCC vectors and fundamental frequency and voicing estimates, such as may be received in a distributed speech recognition system. To facilitate reconstruction, both a sinusoidal model and a source-filter model of speech are compared by listening tests and spectrogram analysis, with the result that the former provides higher quality speech reconstruction. Analysis of the sinusoidal model shows that for clean speech reconstruction, both a noise-free spectral envelope and a robust estimate of the fundamental frequency and voicing are necessary. Investigation into fundamental frequency estimation reveals that an auditory model based approach gives superior performance over other methods of estimation. This leads to the proposal of an integrated front-end which uses the auditory model for both fundamental frequency and voicing estimation, and as the filterbank stage in MFCC extraction, and thereby reduces computation. Applying spectral subtraction to the auditory model parameters improves the spectral envelope estimates needed for clean speech reconstruction. Experiments on the Aurora connected digits database show that the auditory model-based MFCCs give comparable performance to that attained with conventional MFCCs. Speech reconstruction tests reveal that the combination of robust fundamental frequency and voicing estimation with spectral subtraction in the integrated front-end leads to intelligible and relatively noise-free speech.
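
As a rough illustration of two ideas named in the abstract, the sketch below shows magnitude-domain spectral subtraction on a set of filterbank channel outputs, and sinusoidal synthesis of one voiced frame from a fundamental frequency and a spectral envelope. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the `floor` parameter, the zero-phase harmonics, the 8 kHz sample rate, and the `envelope` callable are all hypothetical choices made for illustration, and unvoiced frames are not handled.

```python
import math


def spectral_subtraction(noisy_mag, noise_mag, floor=0.01):
    """Subtract a noise magnitude estimate from each channel.

    Assumption: a simple magnitude-domain rule with a spectral floor,
    so that no channel becomes negative after subtraction.
    """
    return [max(s - n, floor * s) for s, n in zip(noisy_mag, noise_mag)]


def synthesize_voiced_frame(f0_hz, envelope, sample_rate=8000, num_samples=80):
    """Sum the harmonics of f0, each weighted by the spectral envelope.

    `envelope` is a hypothetical callable mapping a frequency in Hz to a
    linear amplitude. Phases are set to zero here for simplicity.
    """
    nyquist = sample_rate / 2.0
    # Keep only harmonics strictly below the Nyquist frequency.
    harmonics = [k * f0_hz for k in range(1, int(nyquist / f0_hz) + 1)
                 if k * f0_hz < nyquist]
    frame = []
    for n in range(num_samples):
        t = n / sample_rate
        sample = sum(envelope(f) * math.cos(2.0 * math.pi * f * t)
                     for f in harmonics)
        frame.append(sample)
    return frame


if __name__ == "__main__":
    cleaned = spectral_subtraction([1.0, 0.5], [0.2, 0.6])
    frame = synthesize_voiced_frame(100.0, lambda f: 1.0)
    print(cleaned, len(frame))
```

In a fuller system the envelope would be derived from the received MFCC vectors and the fundamental frequency would come from the front-end's estimate; both steps are outside this sketch.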

Publisher
Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 48, Issue 6, June 2006, Pages 697–715