Article ID: 565279
Journal: Speech Communication
Published Year: 2016
Pages: 20
File Type: PDF
Abstract

• Quality measures were investigated to predict reliability of speaker verification.
• Bayesian networks estimate the reliability from the score and quality measures.
• Best measures for additive noise: SNR and modulation index.
• Best measures for reverberation: UBM log-likelihood.
• Novel measures based on vector Taylor series performed well.

Despite the great advances made in the speaker recognition field, such as joint factor analysis (JFA) and i-vectors, there are still situations where the quality of the speech signals involved in a speaker verification (SV) trial is not good enough to make reliable decisions. This motivated us to investigate speech quality measures that are related to SV performance. We analyzed measures such as signal-to-noise ratio (SNR), modulation index, number of speech frames, jitter, shimmer, and the likelihood of the data given the universal background model (UBM), JFA, and probabilistic linear discriminant analysis (PLDA) models. In addition, we introduce a novel and promising measure based on the vector Taylor series (VTS) paradigm, which is used to adapt a clean Gaussian mixture model (GMM) to noisy speech. We used Bayesian networks to combine these measures into a probabilistic reliability measure, and applied it to detect misclassified trials. We trained our Bayesian network on NIST SRE08 data distorted with noise and reverberation, and evaluated it on a distorted version of SRE10. We found that the best measures for noise were SNR and modulation index, and the best measure for reverberation was the UBM likelihood. VTS-based measures performed well for both types of distortion.
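
As a rough illustration of how quality measures can be combined into a probabilistic reliability estimate, the sketch below uses a naive Bayes model, which is the simplest possible Bayesian network. It is not the network described in the article. The Gaussian likelihoods, the measure names, and every numeric parameter are hypothetical values chosen only for illustration.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log-density of a univariate Gaussian."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


def reliability_posterior(measures, params, prior_reliable=0.5):
    """Return P(reliable | measures) under a naive Bayes assumption.

    measures: dict mapping measure name to its observed value.
    params:   dict mapping measure name to a pair
              ((mean, std) if reliable, (mean, std) if unreliable).
    """
    log_rel = math.log(prior_reliable)
    log_unrel = math.log(1.0 - prior_reliable)
    for name, value in measures.items():
        (mean_r, std_r), (mean_u, std_u) = params[name]
        log_rel += gaussian_logpdf(value, mean_r, std_r)
        log_unrel += gaussian_logpdf(value, mean_u, std_u)
    # Normalize in the log domain to avoid underflow.
    top = max(log_rel, log_unrel)
    p_rel = math.exp(log_rel - top)
    p_unrel = math.exp(log_unrel - top)
    return p_rel / (p_rel + p_unrel)


# Hypothetical class-conditional parameters, NOT taken from the article.
PARAMS = {
    "snr_db": ((20.0, 5.0), (5.0, 5.0)),
    "modulation_index": ((0.8, 0.1), (0.5, 0.1)),
}

clean_trial = {"snr_db": 22.0, "modulation_index": 0.78}
noisy_trial = {"snr_db": 3.0, "modulation_index": 0.45}

print(reliability_posterior(clean_trial, PARAMS))
print(reliability_posterior(noisy_trial, PARAMS))
```

A trial whose posterior falls below a chosen threshold could then be flagged as unreliable. In practice the class-conditional parameters would be estimated from labeled trials, and a full Bayesian network could also model dependencies between the measures and the verification score, which this sketch ignores.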

Related Topics
Physical Sciences and Engineering > Computer Science > Signal Processing