کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
4973666 1451682 2017 36 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Discriminative subspace modeling of SNR and duration variabilities for robust speaker verification
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر پردازش سیگنال
پیش نمایش صفحه اول مقاله
Discriminative subspace modeling of SNR and duration variabilities for robust speaker verification
چکیده انگلیسی
Although i-vectors together with probabilistic LDA (PLDA) have achieved a great success in speaker verification, how to suppress the undesirable effects caused by the variability in utterance length and background noise level is still a challenge. This paper aims to improve the robustness of i-vector based speaker verification systems by compensating for the utterance-length variability and noise-level variability. Inspired by the recent findings that noise-level variability can be modeled by a signal-to-noise ratio (SNR) subspace and that duration variability can be modeled as additive noise in the i-vector space, we propose to add an SNR factor and a duration factor to the PLDA model. In this framework, we assume that i-vectors derived from utterances with comparable durations share similar duration-specific information and that i-vectors extracted from utterances within a narrow SNR range have similar SNR-specific information. Based on these assumptions, an i-vector can be represented as a linear combination of four components: speaker, SNR, duration, and channel. A variational Bayes algorithm is developed to infer this latent variable model via a discriminative subspace training procedure. In the testing stage, different variabilities are compensated for when computing the likelihood ratio. Experiments on Common Conditions 1 and 4 in NIST 2012 SRE show that the proposed model outperforms the conventional PLDA and SNR-invariant PLDA. Results also show that the proposed model performs better than the uncertainty-propagation PLDA (UP-PLDA) for long test utterances.
ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Computer Speech & Language - Volume 45, September 2017, Pages 83-103
نویسندگان
, , , ,