Combining evidences from magnitude and phase information using VTEO for person recognition using humming

Article code	Journal code	Publication year	English article	Full-text version
6951462	1451675	2018	32-page PDF	Free download


Keywords

Person recognition; Polynomial classifier; Humming

Related topics

Engineering and Basic Sciences; Computer Engineering; Signal Processing


Abstract

Most state-of-the-art speaker recognition systems use natural speech signals (i.e., real, spontaneous, or contextual speech) from the subjects. In this paper, recognition of a person is attempted from his or her hum with the help of machines. This kind of application can be useful for designing a person-dependent Query-by-Humming (QBH) system and hence plays an important role in music information retrieval (MIR) systems. In addition, it can also be useful for other speech technology applications such as human-computer interaction, prosody analysis of disordered speech, and speaker forensics. This paper develops a new feature extraction technique to exploit perceptually meaningful phase spectrum information (perceptually meaningful because mel frequency warping imitates the human hearing process) along with magnitude spectrum information from the hum signal. In particular, the structure of the state-of-the-art feature set, namely Mel Frequency Cepstral Coefficients (MFCCs), is modified to capture the phase spectrum information. In addition, a new energy measure, namely the Variable length Teager Energy Operator (VTEO), is employed to compute subband energies of different time-domain subband signals (i.e., the outputs of the 24 triangular-shaped filters used in the mel filterbank). We refer to this proposed feature set as MFCC-VTMP (i.e., mel frequency cepstral coefficients to capture perceptually meaningful magnitude and phase information via VTEO). The polynomial classifier (which is in principle similar to other discriminatively trained classifiers such as a support vector machine (SVM) with polynomial kernel) is used as the basis for all the experiments.
The effectiveness of the proposed feature set is evaluated and consistently found to be better than the MFCC feature set across several evaluation factors, such as comparison with other phase-based features, the order of the polynomial classifier, the person (speaker) modeling approach (such as GMM-UBM and i-vector), the dimension of the feature vector, robustness under signal degradation conditions, static vs. dynamic features, feature discrimination measures, and intersession variability.
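
The abstract names VTEO as the energy measure but does not define it. As a rough illustration only, the sketch below assumes the form commonly given for the operator in the literature, psi_i[x](n) = x(n)^2 - x(n-i) * x(n+i), where i is a dependency index (i = 1 gives the classical Teager energy operator). The function name `vteo`, the parameter `i`, and the test signal are hypothetical choices made for this example; the paper's actual MFCC-VTMP pipeline (mel filterbank, phase handling, cepstral step) is not reproduced here.

```python
import math


def vteo(x, i=1):
    """Assumed VTEO form: psi_i[x](n) = x(n)^2 - x(n-i) * x(n+i).

    x is a sequence of samples (e.g. one time-domain subband signal);
    i is the dependency index. Only samples that have both neighbours
    n-i and n+i are evaluated, so the output is 2*i samples shorter.
    """
    if i < 1:
        raise ValueError("dependency index i must be >= 1")
    return [x[n] * x[n] - x[n - i] * x[n + i] for n in range(i, len(x) - i)]


# Illustrative input (not from the paper): a pure sinusoid A*cos(omega*n).
A, omega = 2.0, 0.3
x = [A * math.cos(omega * n) for n in range(200)]

# For a sinusoid the operator output is constant: A^2 * sin^2(i * omega).
energy_i3 = vteo(x, i=3)
```

For a pure sinusoid the output is the constant A^2 sin^2(i * omega), so it depends on both amplitude and frequency; under this assumed definition, averaging the operator output over a frame of a subband signal would give one candidate subband energy value.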

Publisher

Database: Elsevier - ScienceDirect
Journal: Computer Speech & Language - Volume 52, November 2018, Pages 225-256

Authors

Hemant A. Patil, Maulik C. Madhavi
