Fusion of auditory inspired amplitude modulation spectrum and cepstral features for whispered and normal speech speaker verification

کد مقاله	کد نشریه	سال انتشار	مقاله انگلیسی	نسخه تمام متن
4973684	1451682	2017	20 صفحه PDF	دانلود رایگان

عنوان انگلیسی مقاله ISI

دانلود مقاله + سفارش ترجمه

دانلود مقاله ISI انگلیسی

رایگان برای ایرانیان

کلمات کلیدی

Mutual information - اطلاعات متقابل Speaker verification - تأیید بلندگو Modulation spectrum - طیف مدولاسیون Whispered speech - گفتار شنیده

موضوعات مرتبط

مهندسی و علوم پایه مهندسی کامپیوتر پردازش سیگنال

پیش نمایش صفحه اول مقاله

Fusion of auditory inspired amplitude modulation spectrum and cepstral features for whispered and normal speech speaker verification

چکیده انگلیسی

Whispered speech is a natural speaking style that despite its reduced perceptibility, still contains relevant information regarding the intended message (i.e., intelligibility), as well as the speaker identity and gender. Given the acoustic differences between whispered and normally-phonated speech, however, speech applications trained on the latter but tested with the former exhibit unacceptable performance levels. Within an automated speaker verification task, previous research has shown that i) conventional features (e.g., mel-frequency cepstral coefficients, MFCCs) do not convey sufficient speaker discrimination cues across the two vocal efforts, and ii) multi-condition training, while improving the performance for whispered speech, tends to deteriorate the performance for normal speech. In this paper, we aim to tackle both shortcomings by proposing three innovative features, which when fused at the score level, are shown to result in reliable results for both normal and whispered speech. Overall, relative improvements of 66% and 63% are obtained for whispered and normal speech, respectively, over a baseline system based on MFCCs and multi-condition training.

ناشر

Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Computer Speech & Language - Volume 45, September 2017, Pages 437-456

نویسندگان

Milton Sarria-Paja, Tiago H. Falk,

علوم انسانی و هنر

فنی، مهندسی و علوم پایه

پزشکی و سلامت

بیو تکنولوژی

پذیرش سفارش ترجمه

Fusion of auditory inspired amplitude modulation spectrum and cepstral features for whispered and normal speech speaker verification

دسترسی سریع

ارتباط

English Website