دانلود رایگان مقاله: شناخت بلندگو از گفتار زمزمه: مطالعه ی آموزشی و کاربرد پیش بینی خطی متغیر زمان

کد مقاله	کد نشریه	سال انتشار	مقاله انگلیسی	نسخه تمام متن
6960609	1452001	2018	48 صفحه PDF	دانلود رایگان

عنوان انگلیسی مقاله ISI

Speaker recognition from whispered speech: A tutorial survey and an application of time-varying linear prediction

ترجمه فارسی عنوان

شناخت بلندگو از گفتار زمزمه: مطالعه ی آموزشی و کاربرد پیش بینی خطی متغیر زمان

دانلود مقاله + سفارش ترجمه

دانلود مقاله ISI انگلیسی

رایگان برای ایرانیان

کلمات کلیدی

Speaker recognition - شناسایی بلندگو Whisper - نجوا Disguise - پوشیدن

موضوعات مرتبط

مهندسی و علوم پایه مهندسی کامپیوتر پردازش سیگنال

پیش نمایش مقاله

شناخت بلندگو از گفتار زمزمه: مطالعه ی آموزشی و کاربرد پیش بینی خطی متغیر زمان

چکیده انگلیسی

From the available biometric technologies, automatic speaker recognition is one of the most convenient and accessible ones due to abundance of mobile devices equipped with a microphone, allowing users to be authenticated across multiple environments and devices. Speaker recognition also finds use in forensics and surveillance. Due to the acoustic mismatch induced by varied environments and devices of the same speaker, leading to increased number of identification errors, much of the research focuses on compensating for such technology-induced variations, especially using machine learning at the statistical back-end. Another much less studied but at least as detrimental source of acoustic variation, however, arises from mismatched speaking styles induced by the speaker, leading to a substantial performance drop in recognition accuracy. This is a major problem especially in forensics where perpetrators may purposefully disguise their identity by varying their speaking style. We focus on one of the most commonly used ways of disguising one's speaker identity, namely, whispering. We approach the problem of normal-whisper acoustic mismatch compensation from the viewpoint of robust feature extraction. Since whispered speech is intelligible, yet a low-intensity signal and therefore prone to extrinsic distortions, we take advantage of robust, long-term speech analysis methods that utilize slow articulatory movements in speech production. In specific, we address the problem using a novel method, frequency-domain linear prediction with time-varying linear prediction (FDLP-TVLP), which is an extension of the 2-dimensional autoregressive (2DAR) model that allows vocal tract filter parameters to be time-varying, rather than piecewise constant as in classic short-term speech analysis. Our speaker recognition experiments on the whisper subset of the CHAINS corpus indicate that when tested in normal-whisper mismatched conditions, the proposed FDLP-TVLP features improve speaker recognition performance by 7-10% over standard MFCC features in relative terms. We further observe that the proposed FDLP-TVLP features perform better than the FDLP and 2DAR methods for whispered speech.

ناشر

Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Speech Communication - Volume 99, May 2018, Pages 62-79

نویسندگان

Ville Vestman, Dhananjaya Gowda, Md Sahidullah, Paavo Alku, Tomi Kinnunen,

علوم انسانی و هنر

فنی، مهندسی و علوم پایه

پزشکی و سلامت

بیو تکنولوژی

پذیرش سفارش ترجمه

دانلود رایگان مقاله ISI : شناخت بلندگو از گفتار زمزمه: مطالعه ی آموزشی و کاربرد پیش بینی خطی متغیر زمان

دسترسی سریع

ارتباط

English Website