I-vector based speaker recognition using advanced channel compensation techniques

Article code	Journal code	Publication year	English article	Full-text version
558291	874892	2014	20 pages (PDF)	Free download

Keywords

Speaker verification; linear discriminant analysis (LDA); i-vector

Related topics

Engineering and Basic Sciences; Computer Engineering; Signal Processing

Abstract

• WMMC and SN-WMMC approaches were introduced to the i-vector system.
• WLDA and SN-WLDA approaches were also introduced to the i-vector system.
• SN-WLDA shows significant improvement over the baseline approach.
• SN-WLDA projected GPLDA also shows improvement over the standard GPLDA system.

This paper investigates advanced channel compensation techniques for improving i-vector speaker verification performance in the presence of high intersession variability, using the NIST 2008 and 2010 SRE corpora. The performance of four channel compensation techniques is investigated: (a) weighted maximum margin criterion (WMMC), (b) source-normalized WMMC (SN-WMMC), (c) weighted linear discriminant analysis (WLDA) and (d) source-normalized WLDA (SN-WLDA). We show that, by extracting the discriminatory information between pairs of speakers and capturing the source variation information in the development i-vector space, the SN-WLDA based cosine similarity scoring (CSS) i-vector system provides over 20% improvement in EER for NIST 2008 interview and microphone verification and over 10% improvement in EER for NIST 2008 telephone verification, compared to the SN-LDA based CSS i-vector system. Further, score-level fusion techniques are analyzed to combine the best channel compensation approaches, providing over 8% improvement in DCF over the best single approach, SN-WLDA, for the NIST 2008 interview/telephone enrolment-verification condition. Finally, we demonstrate that the improvements found in the context of CSS also generalize to state-of-the-art GPLDA, with up to 14% relative improvement in EER for NIST SRE 2010 interview and microphone verification and over 7% relative improvement in EER for NIST SRE 2010 telephone verification.
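
As a rough illustration of two ideas named in the abstract, the sketch below computes a pairwise weighted between-speaker scatter matrix (the ingredient that distinguishes weighted LDA from standard LDA) and a cosine similarity score between two i-vectors. It is a minimal sketch under stated assumptions, not the authors' implementation: the Euclidean-distance weighting `dist ** (-2 * n)`, the function names, and the parameter `n` are all assumptions chosen for illustration, and the source-normalized variant is not shown.

```python
import numpy as np


def weighted_between_class_scatter(ivectors, labels, n=1):
    """Pairwise weighted between-speaker scatter.

    Assumption: each speaker pair is weighted by the Euclidean distance
    between the two speaker means raised to the power -2n, so that close
    (easily confused) speaker pairs contribute more.
    """
    ivectors = np.asarray(ivectors, dtype=float)
    labels = np.asarray(labels)
    speakers = np.unique(labels)
    means = np.stack([ivectors[labels == s].mean(axis=0) for s in speakers])
    counts = np.array([(labels == s).sum() for s in speakers])

    dim = ivectors.shape[1]
    scatter = np.zeros((dim, dim))
    for i in range(len(speakers) - 1):
        for j in range(i + 1, len(speakers)):
            diff = means[i] - means[j]
            dist = np.linalg.norm(diff)
            if dist == 0.0:
                continue  # identical means add nothing to the scatter
            weight = dist ** (-2 * n)
            scatter += weight * counts[i] * counts[j] * np.outer(diff, diff)
    return scatter / len(ivectors)


def cosine_score(target, test, projection=None):
    """Cosine similarity between two i-vectors, optionally after a
    channel-compensation projection (columns of `projection`)."""
    target = np.asarray(target, dtype=float)
    test = np.asarray(test, dtype=float)
    if projection is not None:
        target = projection.T @ target
        test = projection.T @ test
    return float(target @ test / (np.linalg.norm(target) * np.linalg.norm(test)))
```

In a full system, the projection passed to `cosine_score` would typically be built from the leading eigenvectors of the within-speaker scatter inverse times this weighted between-speaker scatter; that step is omitted here.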

Publisher

Database: Elsevier - ScienceDirect
Journal: Computer Speech & Language - Volume 28, Issue 1, January 2014, Pages 121–140

Authors

Ahilan Kanagasundaram, David Dean, Sridha Sridharan, Mitchell McLaren, Robbie Vogt
