Article ID: 4977782
Journal: Speech Communication
Published Year: 2017
Pages: 12 Pages
File Type: PDF
Abstract
Currently, most state-of-the-art speaker verification systems are based on i-vectors and PLDA; however, PLDA requires a large volume of development data from many different speakers, which makes it difficult to learn PLDA parameters for a domain with scarce data. In this paper, we study and extend an effective transfer learning method based on Bayesian joint probability, in which the Kullback-Leibler (KL) divergence between the source domain and the target domain is added as a regularization factor. The method utilizes development data from the source domain to help find the optimal PLDA parameters for the target domain. In particular, speaker verification on short utterances can be viewed as a task in a domain with only a limited amount of long utterances, so transfer learning for PLDA can also be adopted to learn discriminative information from other domains with a great deal of long utterances. Experimental results on the NIST SRE and Switchboard corpora demonstrate that the proposed method offers a significant performance gain over traditional PLDA.
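To make the regularization idea concrete, the sketch below illustrates the general pattern of KL-regularized Gaussian parameter adaptation, not the paper's exact PLDA objective: the closed-form KL divergence between two Gaussians, and a covariance estimate for a data-scarce target domain pulled toward a source-domain covariance. The function names, the interpolation form of `adapt_covariance`, and the weight `lam` are illustrative assumptions.

```python
import numpy as np

def kl_gaussians(mu0, S0, mu1, S1):
    """Closed-form KL( N(mu0, S0) || N(mu1, S1) ) for multivariate Gaussians."""
    d = mu0.shape[0]
    S1_inv = np.linalg.inv(S1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(S1_inv @ S0)          # trace term
                  + diff @ S1_inv @ diff          # mean-shift term
                  - d                             # dimensionality offset
                  + np.log(np.linalg.det(S1) / np.linalg.det(S0)))

def adapt_covariance(S_target_ml, n_target, S_source, lam):
    """Hypothetical regularized estimate: the ML covariance from scarce
    target data is shrunk toward the source-domain covariance; lam plays
    the role of the KL-regularization weight (lam=0 keeps the target ML
    estimate, large lam trusts the source domain)."""
    return (n_target * S_target_ml + lam * S_source) / (n_target + lam)
```

With `lam = 0` the target maximum-likelihood estimate is returned unchanged, and as `lam` grows the estimate approaches the source covariance, mirroring how the KL regularizer trades off target-domain fit against closeness to the source-domain PLDA parameters.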
Related Topics
Physical Sciences and Engineering › Computer Science › Signal Processing