Free article download: Text-independent speaker identification using robust statistics estimation

Article code	Journal code	Publication year	English article	Full text
4977779	1452008	2017	44-page PDF	Free download

English title of the ISI article

Text-independent speaker identification using robust statistics estimation



Keywords

Text-independent speaker identification; Gaussian mixture models; robust statistics estimation; convex optimization

Related topics

Engineering and basic sciences, Computer engineering, Signal processing


Abstract

It is well known that the performance of Gaussian mixture model (GMM)-based text-independent speaker identification systems deteriorates significantly in the presence of noise and spectral distortion in the training and testing utterances. In this paper, we propose a novel GMM-based speaker identification system based on two robust-statistics estimation methods: the minimum volume ellipsoid method and the minimum covariance determinant method. Compared to the traditional maximum likelihood (ML) estimation method, the proposed methods are less sensitive to outliers in the feature-vector space caused by additive noise and spectral distortion. Moreover, in the testing phase, we propose a simple distance metric to be used for comparing the unknown testing utterance against the speakers' models. Furthermore, we derive a more robust version of the i-vector extractor, named robust i-vector, which utilizes our proposed robust estimation methods for estimating the parameters of the base universal background model (UBM). The proposed classification system has been applied to the NIST 2000 speaker recognition evaluation and the COSINE database. It has also been compared against state-of-the-art techniques such as the GMM/UBM method, the super-vectors method, and the i-vector methods. Experimental results show that the proposed classification system provides up to 16% relative improvement in identification performance over the i-vector methods for short utterances in the NIST 2000 database, and up to 8% when the utterances of the NIST 2000 database are contaminated by different types of artificial noise for signal-to-noise ratios ranging from 0 to 20 dB. For the COSINE database, the robust i-vector estimation provides an absolute improvement of up to 8%. Finally, the real-time factor (RT) of the proposed distance metric for testing is 55% higher than the RT of the regular ML scoring.
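To make the contrast between maximum likelihood and minimum covariance determinant (MCD) estimation concrete, here is a minimal one-dimensional sketch. It is not the paper's implementation: the paper works with multivariate feature vectors inside a GMM, whereas this toy uses scalar samples, where the MCD subset is known to be a run of contiguous sorted points. The function names, the default subset size `h`, and the synthetic data are all assumptions made for illustration, and no consistency correction is applied to the MCD variance.

```python
import random
import statistics


def ml_estimate(samples):
    """Maximum-likelihood mean and variance of a 1-D Gaussian."""
    mean = statistics.fmean(samples)
    var = statistics.pvariance(samples, mu=mean)
    return mean, var


def mcd_estimate_1d(samples, h=None):
    """Univariate MCD: mean and raw variance of the h-point subset
    with the smallest variance. In one dimension that subset is always
    a contiguous window of the sorted data, so a linear scan suffices."""
    xs = sorted(samples)
    n = len(xs)
    if h is None:
        h = n // 2 + 1  # assumed default: just over half of the samples
    if not 1 < h <= n:
        raise ValueError("h must satisfy 1 < h <= len(samples)")
    best = None
    for start in range(n - h + 1):
        window = xs[start:start + h]
        mean = statistics.fmean(window)
        var = statistics.pvariance(window, mu=mean)
        if best is None or var < best[1]:
            best = (mean, var)
    return best


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical data: a clean cluster plus a smaller cluster of outliers.
    clean = [rng.gauss(0.0, 1.0) for _ in range(200)]
    outliers = [rng.gauss(15.0, 1.0) for _ in range(40)]
    data = clean + outliers
    print("ML  (mean, variance):", ml_estimate(data))
    print("MCD (mean, variance):", mcd_estimate_1d(data))
```

On data like this, the ML mean is pulled toward the outlier cluster while the MCD mean stays close to the clean cluster, which is the insensitivity to outliers that the abstract describes.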

Publisher

Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 92, September 2017, Pages 52-63

Authors

Moataz El Ayadi, Abdel-Karim S.O. Hassan, Ahmed Abdel-Naby, Omar A. Elgendy
