Article ID: 455162
Journal: Computers & Electrical Engineering
Published Year: 2015
Pages: 13
File Type: PDF
Abstract

Cosine similarity and Probabilistic Linear Discriminant Analysis (PLDA) in i-vector space are two state-of-the-art scoring methods in the field of speaker verification. While PLDA usually gives better accuracy, Cosine Similarity Scoring (CSS) remains widely used owing to its simplicity and acceptable performance. Several channel compensation and score normalization methods have been proposed in this domain to improve performance. We investigate the non-speaker information present in the cosine similarity metric and propose a new approach to remove it from the decision-making process. I-vectors carry a large amount of non-speaker information, such as channel effects, language, and phonetic content. This information increases the verification error rate and should therefore be removed from the score. To this end, we propose a method that uses the development set to estimate the non-speaker information shared between two i-vectors and subtracts that estimate from the cosine similarity. The results indicate that the proposed method outperformed the other cosine-similarity-based methods that were implemented. Furthermore, in certain cases it performed better than PLDA, and combining it with PLDA improved performance in most cases.
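To make the idea of subtracting a non-speaker term from the cosine score concrete, here is a minimal Python sketch. The abstract does not say how the non-speaker information is estimated, so the estimator below is an assumption and should not be read as the authors' method: it treats the leading directions of within-speaker variability in a labeled development set as the non-speaker subspace, in the spirit of a nuisance projection. All function names (`cosine_score`, `within_speaker_basis`, `non_speaker_term`, `corrected_score`) and the `rank` parameter are hypothetical.

```python
import numpy as np


def cosine_score(w1, w2):
    """Plain cosine similarity between two i-vectors."""
    return float(np.dot(w1, w2) / (np.linalg.norm(w1) * np.linalg.norm(w2)))


def within_speaker_basis(dev_ivectors, dev_labels, rank):
    """Assumed estimator (not from the paper): an orthonormal basis for the
    top `rank` directions of within-speaker variability in the dev set."""
    dev_ivectors = np.asarray(dev_ivectors, dtype=float)
    dev_labels = np.asarray(dev_labels)
    centered = np.empty_like(dev_ivectors)
    for spk in np.unique(dev_labels):
        mask = dev_labels == spk
        # Remove each speaker's mean so only session variability remains.
        centered[mask] = dev_ivectors[mask] - dev_ivectors[mask].mean(axis=0)
    # Rows of vt are principal directions of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:rank].T  # shape (dim, rank), orthonormal columns


def non_speaker_term(w1, w2, basis):
    """Share of the cosine score that lies in the assumed non-speaker subspace."""
    p1 = basis.T @ w1
    p2 = basis.T @ w2
    return float(np.dot(p1, p2) / (np.linalg.norm(w1) * np.linalg.norm(w2)))


def corrected_score(w1, w2, basis):
    """Cosine score with the estimated non-speaker term subtracted."""
    return cosine_score(w1, w2) - non_speaker_term(w1, w2, basis)


# Toy development set: two speakers whose sessions differ only along axis 0.
dev = np.array([[1.0, 1.0, 0.0], [-1.0, 1.0, 0.0],
                [1.0, 0.0, 1.0], [-1.0, 0.0, 1.0]])
labels = np.array([0, 0, 1, 1])
basis = within_speaker_basis(dev, labels, rank=1)

different_speakers = corrected_score(dev[0], dev[2], basis)
same_speaker = corrected_score(dev[0], dev[1], basis)
```

In this toy set, the raw cosine score for the different-speaker pair is about 0.5 purely because both recordings share the same session offset, while the same-speaker pair scores about 0. After subtraction the ordering reverses (roughly 0.0 versus 0.5), which is the kind of effect the abstract attributes to removing non-speaker information. Real i-vectors would need a much larger development set and a tuned `rank`.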

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Networks and Communications