Article ID: 10146103
Journal: Pattern Recognition Letters
Published Year: 2018
Pages: 7
File Type: PDF
Abstract
Recent works have reported the successful use of sparse representation (SR) over a learned dictionary for the speaker verification (SV) task. For practical data with large variability, SR based approaches are noted to produce inconsistent sparse codes. In other words, for true-target trials, the dominant coefficients in the sparse codes of the enrollment and test data often involve different atoms of the dictionary. This, in turn, increases the false rejection rate. In this work, we propose a novel yet simple way to address that problem. The key idea is to exploit the sparse coding of the enrollment data when finding the representation of the test data. As the proposed constraint affects the false alarm rate, multi-offset decimation diversity is introduced to counter that effect. The combined approach has lower computational complexity, yet it is shown to outperform the existing factor analysis based SV approach when evaluated on the large-variability NIST 2012 speaker recognition evaluation dataset.
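The key idea can be illustrated with a minimal sketch. The abstract does not specify the solver, the dictionary, or the scoring rule, so everything below is an assumption made for illustration: a small greedy orthogonal matching pursuit (`omp`), a least-squares fit of the test vector restricted to the atoms selected for the enrollment vector (`code_on_support`), and cosine similarity between the two codes as the trial score. The dictionary here is random rather than learned, and multi-offset decimation diversity is not shown.

```python
import numpy as np

def omp(D, x, k):
    """Greedy orthogonal matching pursuit: choose k atoms (columns of D) for x."""
    residual = x.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        corr = np.abs(D.T @ residual)
        if support:
            corr[support] = -np.inf  # never reselect an atom
        support.append(int(np.argmax(corr)))
        # Refit all selected atoms jointly, then update the residual.
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    code = np.zeros(D.shape[1])
    code[support] = coef
    return code, support

def code_on_support(D, x, support):
    """Represent x using only the atoms listed in `support` (assumed constraint)."""
    coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
    code = np.zeros(D.shape[1])
    code[support] = coef
    return code

def cosine(a, b):
    """Cosine similarity with a small constant to avoid division by zero."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Toy data: a random unit-norm dictionary stands in for a learned one.
rng = np.random.default_rng(0)
D = rng.standard_normal((40, 100))
D /= np.linalg.norm(D, axis=0)

atoms = [3, 17, 42]  # hypothetical atoms shared by one speaker's utterances
enroll = D[:, atoms] @ np.array([1.0, 0.8, -0.6]) + 0.05 * rng.standard_normal(40)
test = D[:, atoms] @ np.array([0.9, 0.7, -0.5]) + 0.05 * rng.standard_normal(40)

enroll_code, enroll_support = omp(D, enroll, k=3)
free_test_code, _ = omp(D, test, k=3)                       # unconstrained coding
tied_test_code = code_on_support(D, test, enroll_support)   # enrollment-guided coding

print("unconstrained score:", cosine(enroll_code, free_test_code))
print("enrollment-guided score:", cosine(enroll_code, tied_test_code))
```

Restricting the test code to the enrollment support forces both codes onto the same atoms, which is the consistency the abstract argues for. It also gives non-target test data a code on those same atoms, which is one plausible reading of why the constraint affects the false alarm rate.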
Related Topics
Physical Sciences and Engineering > Computer Science > Computer Vision and Pattern Recognition