Article code: 566064
Journal code: 875922
Publication year: 2012
English article: 23 pages, PDF
Full-text version: free download
English title of the ISI article
Design, analysis and experimental evaluation of block based transformation in MFCC computation for speaker recognition
Related topics
Engineering and Basic Sciences, Computer Engineering, Signal Processing
English abstract

The standard Mel frequency cepstral coefficient (MFCC) computation technique uses the discrete cosine transform (DCT) to decorrelate the log energies of the filter bank outputs. The DCT is a reasonable choice here because the covariance matrix of the Mel filter bank log energies (MFLE) resembles that of a highly correlated Markov-I process. This full-band MFCC computation, in which every filter bank output contributes to all coefficients, has two main disadvantages. First, the covariance matrix of the log energies does not exactly follow the Markov-I property. Second, full-band MFCC features degrade severely when the speech signal is corrupted by narrow-band channel noise, even though a few filter bank outputs may remain unaffected. In this work, we have studied a class of linear transformation techniques based on block-wise transformation of the MFLE, which effectively decorrelate the filter bank log energies and also capture speech information efficiently. The block-based transformation approach is studied thoroughly by investigating a new partitioning technique that highlights its advantages. This article also reports a novel feature extraction scheme that captures information complementary to the wide-band information, which otherwise remains undetected by both standard MFCC and the proposed block transform (BT) techniques. The proposed features are evaluated on NIST SRE databases using a Gaussian mixture model-universal background model (GMM-UBM) based speaker recognition system. We have obtained significant performance improvements over baseline features under both matched and mismatched conditions, and for both standard and narrow-band noise. The proposed method achieves a significant performance improvement in the presence of narrow-band noise when combined with a score computation scheme based on missing feature theory.
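
The contrast between the full-band DCT and a block-wise transform of the MFLE can be illustrated with a minimal Python sketch. It is not the authors' implementation: the number of filters (20), the two equal blocks of 10, the random stand-in log energies, and the function names are all assumptions made for illustration, and the paper's own partitioning scheme is not reproduced here. The sketch corrupts a single filter bank output and counts how many transform coefficients change under each scheme.

```python
import numpy as np


def dct2_matrix(n):
    """Return the n x n orthonormal DCT-II matrix."""
    k = np.arange(n)[:, None]   # coefficient index (rows)
    m = np.arange(n)[None, :]   # filter index (columns)
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    mat[0, :] = np.sqrt(1.0 / n)  # rescale the DC row for orthonormality
    return mat


def full_band_transform(log_energies):
    """Full-band case: one DCT over all filter bank log energies."""
    n = log_energies.shape[-1]
    return log_energies @ dct2_matrix(n).T


def block_transform(log_energies, block_sizes):
    """Apply a separate DCT to each contiguous block of log energies."""
    if sum(block_sizes) != log_energies.shape[-1]:
        raise ValueError("block sizes must sum to the number of filters")
    pieces, start = [], 0
    for size in block_sizes:
        block = log_energies[..., start:start + size]
        pieces.append(block @ dct2_matrix(size).T)
        start += size
    return np.concatenate(pieces, axis=-1)


# Hypothetical setup: 20 filters split into two blocks of 10.
rng = np.random.default_rng(0)
num_filters = 20
block_sizes = (10, 10)

clean = rng.normal(size=num_filters)  # stand-in for one frame of MFLE
noisy = clean.copy()
noisy[15] += 5.0                      # narrow-band corruption of one filter

delta_full = full_band_transform(noisy) - full_band_transform(clean)
delta_block = (block_transform(noisy, block_sizes)
               - block_transform(clean, block_sizes))

print("coefficients changed (full band):",
      int(np.sum(np.abs(delta_full) > 1e-9)))
print("coefficients changed (block):    ",
      int(np.sum(np.abs(delta_block) > 1e-9)))
```

In this toy setup the corruption spreads across the full-band coefficients, while the block-wise scheme confines it to the coefficients of the block that contains the corrupted filter. That locality is what makes it plausible to pair block-based features with a missing-feature style scoring scheme, as the abstract describes.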


► A class of block-based MFCC computation techniques is investigated.
► A formant-specific subband partitioning scheme is shown to be more efficient.
► A technique for computing relative subband information is proposed.
► Multiple systems have been successfully fused to improve performance.
► The proposed system is more robust than MFCC in the presence of noise.

Publisher
Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 54, Issue 4, May 2012, Pages 543–565