Article ID: 6961037
Journal: Speech Communication
Published Year: 2015
Pages: 15
File Type: PDF
Abstract
Systems that combine i-vectors with probabilistic linear discriminant analysis (PLDA) have recently become one of the state-of-the-art approaches to text-independent speaker verification. The training data for a PLDA model is often collected from a large, diverse population, but including irrelevant or noisy training data may degrade verification performance. In this paper, we first show that selecting training data with k-nearest neighbors (k-NN) improves speaker verification performance. We then present a robust way of selecting k based on the local distance-based outlier factor (LDOF), and call the resulting method flexible k-NN (fk-NN). We conduct experiments on male and female trials of several telephone conditions of the NIST 2006, 2008, 2010 and 2012 Speaker Recognition Evaluations (SRE). With fk-NN, we discard a substantial amount of irrelevant or noisy training data without having to tune k, and achieve significant performance improvements on the NIST SRE sets.
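The two building blocks named in the abstract, k-NN data selection and LDOF, can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not state the distance metric or the rule that turns LDOF into a choice of k, so the code below assumes Euclidean distance, uses the standard LDOF definition (mean distance to the k nearest neighbors divided by the mean pairwise distance among those neighbors), and leaves the selection of k out. The function names `knn_select` and `ldof` are hypothetical.

```python
import numpy as np


def pairwise_dist(a, b):
    """Euclidean distances between rows of a (n, d) and rows of b (m, d)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def knn_select(targets, pool, k):
    """Indices of pool rows that are among the k nearest neighbors of
    at least one target row (assumed selection rule, for illustration)."""
    d = pairwise_dist(targets, pool)
    nearest = np.argsort(d, axis=1)[:, :k]
    return np.unique(nearest)


def ldof(x, pool, k):
    """Standard LDOF of x with respect to its k nearest neighbors in pool.

    Assumes k >= 2 and that x itself is not a row of pool.
    """
    d = pairwise_dist(x[None, :], pool)[0]
    idx = np.argsort(d)[:k]
    knn_mean = d[idx].mean()                 # mean distance from x to its neighbors
    inner = pairwise_dist(pool[idx], pool[idx])
    inner_mean = inner.sum() / (k * (k - 1))  # mean over ordered pairs, diagonal is zero
    return knn_mean / inner_mean


# Toy 2-D "i-vectors": a tight cluster plus two distant points.
pool = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0],
                 [10.0, 10.0], [11.0, 10.0]])
targets = np.array([[0.5, 0.5]])

selected = knn_select(targets, pool, k=4)
inlier_score = ldof(np.array([0.5, 0.5]), pool[:4], k=4)
outlier_score = ldof(np.array([10.0, 10.0]), pool[:4], k=4)
```

In this toy setup the four cluster points are selected and the two distant points are discarded. An LDOF well above 1 marks a point that lies far from its neighborhood relative to the neighborhood's own spread, which is the kind of signal a k-selection rule could build on.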