A fast and scalable hybrid FA/PPCA-based framework for speaker recognition

Article code	Journal code	Publication year	English article	Full-text version
558755	1451748	2014	9-page PDF	Free download


Keywords

I-vectors; PPCA; Speaker recognition

Related topics

Engineering and basic sciences; Computer engineering; Signal processing


English abstract

• The hybrid FA/PPCA system is presented.
• Two approaches, termed AN and AC, are proposed to speed up i-vector estimation during testing in a speaker recognition system.
• Significant speed-ups are obtained for each proposed approach.
• The scalability of the hybrid system is demonstrated using two suitable features (MFCC and MFS).
• Fusion of systems that use the AC-type approximation performs similarly to the corresponding hybrid system baseline.

A text-independent speaker recognition system using a hybrid of Probabilistic Principal Component Analysis (PPCA) and the conventional i-vector modeling technique is proposed. In this framework, the total variability space (TVS) is estimated using PPCA, while the i-vectors of target speakers and test utterances are extracted using the conventional method. This leads to an appreciable decrease in development time, while the time required for training and testing remains unchanged. In this paper, an algorithmic optimization to the PPCA's EM algorithm is developed. This is observed to provide a speed-up of 3.7×. To simplify the testing procedure, two different approximation procedures are proposed for use in this framework. The first approximation assumes a covariance matrix computed based on the PPCA framework. The second approximation proposes an optimization to avoid inverting the precision matrix of the i-vector. A comparison of the time taken by these approximations with the baseline i-vector extraction procedure shows speed gains with some deterioration in performance in terms of the Equal Error Rate (EER). Among the proposed techniques, the best-case trade-off is a speed-up of 81.2× with a deterioration in performance of 0.7% in absolute terms. Speaker recognition performance is studied on the telephone conditions of the benchmark NIST SRE 2010 dataset with systems built on the Mel Frequency Cepstral Coefficient (MFCC) feature. A trade-off in performance is observed when the proposed approximations are used. The scalability of these trade-offs is tested on the Mel Filterbank Slope (MFS) feature. The trade-offs observed with the approximations are reduced when the two systems are fused.
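To make concrete what "inverting the precision matrix of the i-vector" refers to, the following is a minimal NumPy sketch of the conventional i-vector point estimate, under the standard formulation: the precision matrix is I + Σ_c N_c T_cᵀ Σ_c⁻¹ T_c, and the i-vector is the solution of the corresponding linear system. The function name `extract_ivector`, the array layout, and the argument names are assumptions made for illustration. The abstract does not specify the paper's AN and AC approximations in detail, so they are not reproduced here.

```python
import numpy as np


def extract_ivector(T, sigma_inv, N, F):
    """Conventional i-vector point estimate (illustrative sketch only).

    Assumed layout (hypothetical, not taken from the paper):
      T         : (C, D, R) total variability matrix, one (D, R) block per
                  UBM mixture component
      sigma_inv : (C, D) inverse of the diagonal UBM covariances
      N         : (C,) zeroth-order Baum-Welch statistics of the utterance
      F         : (C, D) first-order statistics, centered on the UBM means
    Returns the i-vector (R,) and its posterior precision matrix (R, R).
    """
    C, D, R = T.shape
    # T_c^T Sigma_c^{-1} for every component: shape (C, R, D)
    Tt_Sinv = np.transpose(T, (0, 2, 1)) * sigma_inv[:, None, :]
    # Precision matrix: I + sum_c N_c * T_c^T Sigma_c^{-1} T_c
    precision = np.eye(R) + np.einsum("c,crd,cds->rs", N, Tt_Sinv, T)
    # Right-hand side: sum_c T_c^T Sigma_c^{-1} F_c
    rhs = np.einsum("crd,cd->r", Tt_Sinv, F)
    # Solve the R x R system; this per-utterance step is the cost that
    # fast approximations aim to reduce.
    w = np.linalg.solve(precision, rhs)
    return w, precision
```

Because the precision matrix depends on the utterance's zeroth-order statistics, this R × R system has to be rebuilt and solved for every utterance, which is why approximations at this step can yield large speed-ups at test time.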

Publisher

Database: Elsevier - ScienceDirect
Journal: Digital Signal Processing - Volume 32, September 2014, Pages 137–145

Authors

Srikanth R. Madikeri,
