Article code: 567169
Journal code: 876053
Publication year: 2013
English article: 13-page PDF
Full-text version: Free download
English title of the ISI article
In-set/out-of-set speaker recognition in sustained acoustic scenarios using sparse data
Related topics
Engineering and Basic Sciences, Computer Engineering, Signal Processing
Abstract

This study addresses the problem of identifying in-set versus out-of-set speakers in noise for limited train/test durations in situations where rapid detection and tracking are required. The objective is to decide whether the current input speaker is accepted as a member of an enrolled in-set group or rejected as an outside speaker. A new scoring algorithm that combines log-likelihood scores across an energy-frequency grid is developed, in which high-energy speaker-dependent frames are fused with weighted scores from low-energy noise-dependent frames. By leveraging the balance between the speaker and the background noise environment, it is possible to realize an improvement in overall equal error rate (EER) performance. Using speakers from the TIMIT database with 5 s of train and 2 s of test, the average optimum relative EER performance improvement for the proposed full selective leveraging approach is +31.6%. The optimum relative EER performance improvement using 10 s of NIST SRE-2008 is +10.8% using the proposed approach. The results confirm that, for situations in which the background environment type remains constant between train and test, an in-set/out-of-set speaker recognition system that takes advantage of information gathered from the environmental noise can be formulated, and that it realizes significant improvement when only extremely limited amounts of train/test data are available.
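
The fusion idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the energy-frequency grid is reduced here to a single energy threshold, and the function names, the `noise_weight` parameter, and the zero fallback for empty frame groups are all assumptions made for illustration. Frames at or above the threshold contribute speaker-model log-likelihood scores, frames below it contribute noise-model scores, and the two averages are combined with a weight.

```python
from statistics import fmean


def fuse_scores(speaker_scores, noise_scores, frame_energies,
                energy_threshold, noise_weight=0.5):
    """Hypothetical fusion of per-frame log-likelihood scores.

    speaker_scores: per-frame scores against the in-set speaker models.
    noise_scores:   per-frame scores against the environment (noise) models.
    frame_energies: per-frame energy values used to split the frames.
    All three sequences are assumed to have the same length.
    """
    # High-energy frames are treated as speaker-dependent.
    high = [s for s, e in zip(speaker_scores, frame_energies)
            if e >= energy_threshold]
    # Low-energy frames are treated as noise-dependent.
    low = [n for n, e in zip(noise_scores, frame_energies)
           if e < energy_threshold]

    # Assumption: an empty group contributes a neutral score of 0.0.
    speaker_part = fmean(high) if high else 0.0
    noise_part = fmean(low) if low else 0.0

    # Weighted combination of the two evidence sources.
    return speaker_part + noise_weight * noise_part


def accept_in_set(fused_score, decision_threshold=0.0):
    """Accept the input as an in-set speaker if the fused score is high enough."""
    return fused_score >= decision_threshold
```

In this sketch, `noise_weight` controls how much the environment evidence counts relative to the speaker evidence. That balance is only meaningful when the background environment is the same in train and test, which is the condition the abstract states.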


► Problem of in-set versus out-of-set speaker (IS/OoS) recognition in noisy conditions is addressed.
► Focused on limited train/test durations (2–8 s); requires speaker environment to be consistent.
► IS/OoS based on leveraging ID performance for both speaker and environmental spaces.
► Speaker/noise leveraging achieves relative +31.6% EER improvement for TIMIT.
► Speaker/noise leveraging achieves relative +10.8% EER improvement for NIST SRE.
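
The relative EER figures above can be read with the help of a short sketch. It assumes the conventional definition of relative improvement (baseline EER minus proposed EER, divided by baseline EER); the source does not spell the formula out. The function names and the simple threshold sweep used to estimate the EER are likewise illustrative assumptions.

```python
def equal_error_rate(in_set_scores, out_of_set_scores):
    """Estimate the EER by sweeping a decision threshold over the observed scores.

    A trial is accepted when its score is >= the threshold. The EER is taken
    at the threshold where false-accept and false-reject rates are closest.
    """
    thresholds = sorted(set(in_set_scores) | set(out_of_set_scores))
    best_gap, eer = None, None
    for t in thresholds:
        # False rejection: an in-set trial scored below the threshold.
        frr = sum(s < t for s in in_set_scores) / len(in_set_scores)
        # False acceptance: an out-of-set trial scored at or above it.
        far = sum(s >= t for s in out_of_set_scores) / len(out_of_set_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


def relative_improvement(baseline_eer, proposed_eer):
    """Relative EER improvement in percent (positive means the EER dropped)."""
    return 100.0 * (baseline_eer - proposed_eer) / baseline_eer
```

As a purely hypothetical example, reducing a baseline EER of 20% to 15% corresponds to a relative improvement of 25%. The baseline EER values behind the reported +31.6% and +10.8% are not given in this record.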

Publisher
Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 55, Issue 6, July 2013, Pages 769–781