Free article download: Linguistically-constrained formant-based i-vectors for automatic speaker recognition

Article code	Journal code	Publication year	English article	Full-text version
566000	1452024	2016	21-page PDF	Free download

English title of the ISI article

Linguistically-constrained formant-based i-vectors for automatic speaker recognition



Keywords

Automatic speaker recognition; Formant frequencies; Formant dynamics; Linguistically-constrained systems

Related topics

Engineering and Basic Sciences; Computer Engineering; Signal Processing


Abstract

• We present an approach to automatic speaker verification through linguistically-constrained i-vector systems based on formant frequencies.
• An analysis of discriminative and calibration properties is presented for every linguistic unit (phones and diphones).
• An analysis of the best-performing units for different speakers reveals remarkable speaker-dependent specificities.
• Different approaches for selection and fusion of different linguistic units are also analysed.
• The fusion of cepstral-based and formant-based systems obtains improved performance.
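The score-level fusion mentioned in the highlights can be illustrated with a small sketch. This is not the authors' implementation: the function names, the toy scores, the learning rate and the number of epochs are all assumptions made for illustration, and the logistic regression is trained with plain gradient descent rather than any particular toolkit.

```python
import math

def average_fusion(scores_per_unit):
    """Average the scores of the linguistic units available in one trial."""
    return sum(scores_per_unit) / len(scores_per_unit)

def train_logistic_fusion(trials, labels, lr=0.1, epochs=2000):
    """Fit weights w and bias b so that sigmoid(w.x + b) predicts target trials.

    trials: list of per-unit score vectors (same length for every trial).
    labels: 1 for same-speaker trials, 0 for different-speaker trials.
    """
    n_units = len(trials[0])
    w = [0.0] * n_units
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_units
        grad_b = 0.0
        for x, y in zip(trials, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - y  # gradient of the cross-entropy loss w.r.t. z
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        w = [wi - lr * g / len(trials) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(trials)
    return w, b

def logistic_fusion(scores_per_unit, w, b):
    """Fused score: the learned linear combination of the per-unit scores."""
    return sum(wi * xi for wi, xi in zip(w, scores_per_unit)) + b

# Toy, made-up scores for two hypothetical linguistic units per trial.
trials = [[2.0, 1.5], [1.0, 2.5], [-1.5, -0.5], [-2.0, -1.0]]
labels = [1, 1, 0, 0]
w, b = train_logistic_fusion(trials, labels)
fused = [logistic_fusion(x, w, b) for x in trials]
```

Average fusion needs no training data, whereas logistic regression learns one weight per unit and so can favour the more discriminative units; in this sketch a single set of weights is shared by all speakers.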

This paper presents a large-scale study of the discriminative abilities of formant frequencies for automatic speaker recognition. Exploiting both the static and dynamic information in formant frequencies, we present linguistically-constrained formant-based i-vector systems that provide well-calibrated likelihood ratios for each comparison of the occurrences of the same isolated linguistic unit in two given utterances. As a first result, the reported analysis of the discriminative and calibration properties of the different linguistic units provides useful insights, for instance, to forensic phonetic practitioners. Furthermore, it is shown that the set of most discriminative units varies from speaker to speaker. Secondly, linguistically-constrained systems are combined at score level through average and logistic regression speaker-independent fusion rules, exploiting the different speaker-distinguishing information spread among the different linguistic units. Testing on the English-only trials of the core condition of the NIST 2006 SRE (24,000 voice comparisons of 5-minute telephone conversations from 517 speakers, 219 male and 298 female), we report equal error rates of 9.57% and 12.89% for male and female speakers respectively, using only formant frequencies as speaker-discriminative information. Additionally, when the formant-based system is fused with a cepstral i-vector system, we obtain relative improvements of ∼6% in EER (from 6.54% to 6.13%) and ∼15% in minDCF (from 0.0327 to 0.0279) compared to the cepstral system alone.
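The relative improvements quoted in the abstract follow directly from the reported figures, and the equal error rate itself can be approximated with a simple threshold sweep. The sketch below is illustrative only: the helper names and the toy score lists are assumptions, and the EER routine is a coarse approximation rather than the evaluation tooling used in the paper.

```python
def relative_improvement(baseline, fused):
    """Relative reduction (%) of an error metric with respect to a baseline."""
    return 100.0 * (baseline - fused) / baseline

def equal_error_rate(target_scores, nontarget_scores):
    """Approximate EER (%) by sweeping a threshold over all observed scores."""
    best_gap, eer = None, None
    for thr in sorted(set(target_scores) | set(nontarget_scores)):
        # False rejection: target trials scoring below the threshold.
        frr = sum(s < thr for s in target_scores) / len(target_scores)
        # False acceptance: non-target trials scoring at or above it.
        far = sum(s >= thr for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, 100.0 * (far + frr) / 2.0
    return eer

# Figures reported in the abstract (cepstral system alone vs. fused system).
eer_gain = relative_improvement(6.54, 6.13)
dcf_gain = relative_improvement(0.0327, 0.0279)
print(f"EER: {eer_gain:.2f}%  minDCF: {dcf_gain:.2f}%")
# prints "EER: 6.27%  minDCF: 14.68%"

# Toy, made-up scores to exercise the EER helper.
toy_eer = equal_error_rate([2.0, 1.0, 0.5, -0.2], [-1.0, -0.5, 0.7, -2.0])
```

The computed values of about 6.27% and 14.68% are consistent with the rounded ∼6% and ∼15% stated in the abstract.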

Publisher

Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 76, February 2016, Pages 61–81

Authors

Javier Franco-Pedroso, Joaquin Gonzalez-Rodriguez
