Article code: 568491
Journal code: 1452020
Publication year: 2016
English article: 11-page PDF
Full-text version: free download
English title of the ISI article
Feature sparsity analysis for i-vector based speaker verification
Persian translation of the title (rendered back into English)
Analysis of feature sparsity for speaker verification based on the i-vector
Keywords
Speaker verification; Total factor space; Feature sparsity; Adaptive first-order Baum–Welch statistics analysis (AFSA)
Related topics
Engineering and Basic Sciences > Computer Engineering > Signal Processing
English abstract


• Feature (speaker-frame) sparsity can cause over-fitting in i-vector based speaker verification systems.
• A series of designed experiments verifies that feature sparsity causes the training of the speaker model to get stuck in local maxima.
• An improved algorithm, adaptive first-order Baum–Welch statistics analysis (AFSA), is proposed to compensate for the feature sparsity problem.
• Experimental results show that AFSA improves the performance of the i-vector system, especially when the training and test utterances are short.
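
To make the notion of feature sparsity concrete, the following is a minimal sketch of the standard zeroth-order and centered first-order Baum–Welch statistics that an i-vector extractor consumes. It is not the AFSA algorithm, whose details are not given on this page. The function name `baum_welch_stats`, the toy diagonal-covariance GMM standing in for a universal background model, and all sizes are assumptions chosen for illustration only.

```python
import numpy as np

def baum_welch_stats(frames, weights, means, variances):
    """Zeroth-order and centered first-order statistics under a diagonal GMM."""
    # Log-density of every frame t under every component c.
    diff = frames[:, None, :] - means[None, :, :]                  # (T, C, D)
    log_gauss = -0.5 * (
        np.sum(diff ** 2 / variances[None], axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)[None]
    )                                                              # (T, C)
    # Posterior probability of each component given each frame.
    log_post = np.log(weights)[None] + log_gauss
    log_post -= np.logaddexp.reduce(log_post, axis=1, keepdims=True)
    post = np.exp(log_post)                                        # rows sum to 1
    n = post.sum(axis=0)                                           # zeroth order, (C,)
    f = post.T @ frames - n[:, None] * means                       # centered first order, (C, D)
    return n, f

# Hypothetical toy model: 64 components in 4 dimensions (real systems are far larger).
rng = np.random.default_rng(0)
C, D = 64, 4
weights = np.full(C, 1.0 / C)
means = rng.normal(scale=3.0, size=(C, D))
variances = np.ones((C, D))

def sample_utterance(num_frames):
    """Draw synthetic frames from the toy model (stand-in for real features)."""
    comps = rng.integers(0, C, size=num_frames)
    return means[comps] + rng.normal(size=(num_frames, D))

n_short, f_short = baum_welch_stats(sample_utterance(20), weights, means, variances)
n_long, f_long = baum_welch_stats(sample_utterance(2000), weights, means, variances)

print("components with occupancy < 1 (short):", int((n_short < 1).sum()))
print("components with occupancy < 1 (long): ", int((n_long < 1).sum()))
```

Because the occupancies sum to the number of frames, a 20-frame utterance can give at most 20 of the 64 components an occupancy of one or more; the first-order statistics of the remaining components rest on very little data, which is the sparsity the highlights refer to.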

In recent years, the i-vector based framework has been proven to provide state-of-the-art performance in the speaker verification field. Each utterance is projected onto a total factor space and is represented by a low-dimensional i-vector. However, the degradation of performance in the i-vector space remains problematic and is commonly attributed to channel variability. Most techniques used for the channel compensation of i-vectors, such as linear discriminant analysis (LDA) or probabilistic linear discriminant analysis (PLDA), aim to compensate for the variabilities caused by channel effects. However, in real-world applications, the durations of the enrollment and test utterances from each user (speaker) are always very limited. In this paper, we demonstrate, from both analytical and experimental perspectives, that feature sparsity and imbalance widely exist in short utterances, in which case the conventional i-vector extraction algorithm, based on maximum likelihood estimation (MLE), may over-fit and degrade the performance of the speaker verification system. This prompted us to propose an improved i-vector extraction algorithm, which we term adaptive first-order Baum–Welch statistics analysis (AFSA). This new algorithm suppresses and compensates for the deviation from first-order Baum–Welch statistics caused by feature sparsity and imbalance. We report results on the male telephone portion of the core trial condition (short2-short3) and on other short-duration trial conditions (short2-10sec and 10sec-10sec) of the NIST 2008 Speaker Recognition Evaluation (SRE) dataset. As measured by both the Equal Error Rate (EER) and the minimum value of the NIST Detection Cost Function (minDCF), a 10%–15% relative improvement is obtained over the baseline traditional i-vector based system.
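
As a rough illustration of how the reported metric and the "relative improvement" figure are typically computed, the sketch below estimates the EER from lists of target and non-target trial scores and expresses a change in EER relative to a baseline. The function names and the toy scores are hypothetical and are not taken from the paper; minDCF is omitted because it additionally depends on the cost parameters of the evaluation plan.

```python
import numpy as np

def equal_error_rate(target_scores, nontarget_scores):
    """Approximate EER: the operating point where miss and false-alarm rates meet."""
    target_scores = np.asarray(target_scores, dtype=float)
    nontarget_scores = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target_scores, nontarget_scores]))
    # Miss: a target trial scored below the threshold.
    p_miss = np.array([(target_scores < t).mean() for t in thresholds])
    # False alarm: a non-target trial scored at or above the threshold.
    p_fa = np.array([(nontarget_scores >= t).mean() for t in thresholds])
    i = np.argmin(np.abs(p_miss - p_fa))
    return 0.5 * (p_miss[i] + p_fa[i])

def relative_improvement(baseline, improved):
    """Fractional reduction of an error metric with respect to the baseline."""
    return (baseline - improved) / baseline

# Hypothetical toy scores, for illustration only.
eer_separated = equal_error_rate([2.0, 3.0], [0.0, 1.0])
eer_overlap = equal_error_rate([1.0, 3.0], [0.0, 2.0])
print(eer_separated, eer_overlap)
```

Under this definition, a hypothetical baseline EER of 5.0% reduced to 4.5% would count as a 10% relative improvement.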

Publisher
Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 80, June 2016, Pages 60–70
Authors
, , , , ,