Robust discriminative training against data insufficiency in PLDA-based speaker verification

Article ID	Journal	Published Year	Pages	File Type
558245	Computer Speech & Language	2016	26 Pages	PDF

Abstract

• We address data insufficiency in discriminative PLDA training.
• First, we compensate for statistical dependencies in the training data.
• Second, we propose three constrained discriminative training schemes.

Probabilistic linear discriminant analysis (PLDA) with i-vectors as features has become one of the state-of-the-art methods in speaker verification. Discriminative training (DT) has proven effective for improving PLDA's performance but suffers more from data insufficiency than generative training (GT). In this paper, we achieve robustness against data insufficiency in DT in two ways. First, we compensate for statistical dependencies in the training data by adjusting the weights of the training trials so that the training loss is an accurate estimate of the expected loss. Second, we propose three constrained DT schemes, the best of which is a discriminatively trained, four-parameter transformation of the PLDA score function. Experiments on the male telephone part of NIST SRE 2010 confirmed the effectiveness of our proposed techniques. For various numbers of training speakers, the combination of weight adjustment and the constrained DT scheme gave between 7% and 19% relative improvement in Ĉ_llr over GT followed by score calibration. Relative to a second baseline, DT of all the parameters of the PLDA score function, the improvements were larger.
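For readers unfamiliar with the evaluation metric, the log-likelihood-ratio cost (Cllr) is commonly defined as the average of two logarithmic penalties, one over target trials and one over non-target trials. The following is a minimal sketch of that standard definition, assuming the scores are natural-log likelihood ratios; the function names are illustrative and not taken from the paper, and the exact variant reported in the paper may differ in detail.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def cllr(target_llrs, nontarget_llrs):
    """Log-likelihood-ratio cost in bits (standard definition, assumed here).

    target_llrs:    natural-log LLR scores of same-speaker trials
    nontarget_llrs: natural-log LLR scores of different-speaker trials
    """
    # Target trials are penalized for low scores: log(1 + exp(-s)).
    c_tar = sum(softplus(-s) for s in target_llrs) / len(target_llrs)
    # Non-target trials are penalized for high scores: log(1 + exp(s)).
    c_non = sum(softplus(s) for s in nontarget_llrs) / len(nontarget_llrs)
    # Average the two classes and convert from nats to bits.
    return 0.5 * (c_tar + c_non) / math.log(2.0)
```

Under this definition, a system that always outputs a score of 0 (no information) has a cost of 1 bit, and well-separated, well-calibrated scores push the cost toward 0, so lower values are better.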

Keywords

PLDA; Discriminative training; Overfitting; Speaker verification