Article code | Journal code | Publication year | English article | Full-text version
535328 | 870341 | 2014 | 7-page PDF | Free download
English title of the ISI article
Session compensation using binary speech representation for speaker recognition
Keywords
Speaker recognition, session variability compensation, nuisance attribute
Related topics
Physical Sciences and Engineering, Computer Science, Computer Vision and Pattern Recognition
English abstract


• We aim to present the power of a new speech representation, the Speaker Binary Key.
• A new variant of the within-class scatter matrix is proposed for session compensation.
• A covariance matrix that uses common attributes contains much more information.
• The i-vector and binary key frameworks contain complementary information.

Recently, a simple representation of a speech excerpt was proposed: a binary matrix in which each acoustic frame is represented by a binary vector. This new approach relies on the UBM paradigm but shifts the speaker recognition workspace from a continuous probabilistic space to a discrete binary space, allowing easy access to the speaker-discriminant information. In addition to the time-related abilities of this representation, it also allows the system to work with a more compact representation based on cumulative vectors. A cumulative vector is the sum of a set of frame-based binary vectors. In this space, global information can be exploited to compensate for the effects of session variability, and this work is mainly dedicated to that aspect. A new variability compensation method in the cumulative vector space is proposed in order to remove not only the unwanted attributes of session variability but also the attributes common among speakers. This is done by incorporating the information common to all classes into the projection matrix. A specificity selection approach using a mask in the cumulative vector space is also proposed, which aims to reduce the non-informative coefficients. The experimental validation, done on the NIST-SRE framework, demonstrates the efficiency of the proposed solutions, which show an EER improvement of between 42% and 61%. The combination of the i-vector and binary approaches, using the proposed methods, showed the complementarity of the discriminatory information exploited by each of them.
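
As a small illustration of the cumulative vector defined in the abstract (the sum of a set of frame-based binary vectors), the sketch below sums a toy binary matrix column by column. The function name `cumulative_vector` and the frame values are hypothetical and chosen only for this example; the abstract does not describe how the binary vectors themselves are obtained, so that step is not shown.

```python
def cumulative_vector(binary_frames):
    """Sum equal-length 0/1 frame vectors component-wise.

    `binary_frames` is a list of rows, one row per acoustic frame.
    The name and input layout are assumptions made for this sketch.
    """
    if not binary_frames:
        raise ValueError("need at least one frame")
    length = len(binary_frames[0])
    if any(len(frame) != length for frame in binary_frames):
        raise ValueError("all frames must have the same length")
    # zip(*rows) iterates over columns; each column sum is one coefficient.
    return [sum(column) for column in zip(*binary_frames)]


# Toy binary matrix: 3 frames, 4 binary coefficients each (made-up values).
frames = [
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 1],
]
print(cumulative_vector(frames))  # [2, 2, 0, 2]
```

Each coefficient of the result counts how many frames activated that position, which is why a whole excerpt can be summarized by a single vector regardless of its number of frames.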

Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition Letters - Volume 49, 1 November 2014, Pages 17–23