Exploring similarity-based classification of larynx disorders from human voice

Article ID	Journal ID	Publication year	English article	Full-text version
566045	875914	2012	10-page PDF	Free download

Keywords

GMM; Laryngeal disorder; Pathological voice; Mel-frequency cepstral coefficients; Earth mover's distance; Kullback–Leibler divergence

Related topics

Engineering and Basic Sciences, Computer Engineering, Signal Processing

Abstract

This paper investigates the identification of laryngeal disorders using cepstral parameters of the human voice. Mel-frequency cepstral coefficients (MFCCs), extracted from audio recordings of a patient's voice, are further approximated using various strategies (sampling, averaging, and clustering by a Gaussian mixture model, GMM). The effectiveness of similarity-based classification techniques in categorizing such pre-processed data into normal voice, nodular, and diffuse vocal fold lesion classes is explored, and schemes to combine binary decisions of support vector machines (SVMs) are evaluated. The widely used RBF kernel was compared to several custom kernels: (i) a sequence kernel, defined over a pair of matrices rather than over a pair of vectors, which calculates the kernelized principal angle (KPA) between subspaces; (ii) a simple supervector kernel using only the means of a patient's GMM; (iii) two distance kernels, specifically tailored to exploit the covariance matrices of the GMM, which use as similarity metrics the Kullback–Leibler divergence approximated by Monte Carlo sampling (KL-MCS) and the Kullback–Leibler divergence combined with the Earth mover's distance (KL-EMD). The sequence kernel and the distance kernels both outperformed the popular RBF kernel, but the difference is statistically significant only for the distance kernels. When tested on voice recordings collected from 410 subjects (130 normal voice, 140 diffuse, and 140 nodular vocal fold lesions), the KL-MCS kernel, using GMM with full covariance matrices, and the KL-EMD kernel, using GMM with diagonal covariance matrices, provided the best overall performance. In most cases, SVM reached higher accuracy than least squares SVM, except for common binary classification using distance kernels.
The results indicate that modeling features with GMM and exploiting this information with kernel methods is an interesting fusion of generative (probabilistic) and discriminative (hyperplane) models for similarity-based classification.
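
To make the KL-MCS idea more concrete, the following is a minimal sketch of a Monte Carlo estimate of the Kullback–Leibler divergence between two GMMs, turned into a similarity value. It is not the authors' implementation: the abstract does not give the exact kernel form, so the exponential mapping `exp(-gamma * d)`, the symmetrization, the `gamma` value, the function names, and the toy GMM parameters are all assumptions made for illustration. Diagonal covariances are used only to keep the sketch short, whereas the abstract reports KL-MCS with full covariance matrices.

```python
import math
import random

# A GMM is represented here (hypothetically) as a tuple:
# (weights, means, variances), with diagonal covariances.

def gmm_logpdf(x, weights, means, variances):
    """Log-density of a diagonal-covariance GMM at point x."""
    comp_logs = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for xi, m, v in zip(x, mu, var):
            ll += -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        comp_logs.append(ll)
    top = max(comp_logs)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(c - top) for c in comp_logs))

def gmm_sample(rng, weights, means, variances):
    """Draw one point: pick a component by weight, then sample it."""
    k = rng.choices(range(len(weights)), weights=weights)[0]
    return [rng.gauss(m, math.sqrt(v)) for m, v in zip(means[k], variances[k])]

def kl_mcs(p, q, n_samples=2000, seed=0):
    """Monte Carlo estimate of KL(p || q): mean of log p(x) - log q(x), x ~ p."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = gmm_sample(rng, *p)
        total += gmm_logpdf(x, *p) - gmm_logpdf(x, *q)
    return total / n_samples

def kl_kernel(p, q, gamma=0.1, n_samples=2000, seed=0):
    """Assumed kernel form: exp(-gamma * symmetrized KL estimate)."""
    d = kl_mcs(p, q, n_samples, seed) + kl_mcs(q, p, n_samples, seed)
    return math.exp(-gamma * max(d, 0.0))  # clamp small negative estimates

# Toy two-component GMMs in two dimensions (made-up parameters).
gmm_a = ([0.5, 0.5], [[0.0, 0.0], [2.0, 2.0]], [[1.0, 1.0], [1.0, 1.0]])
gmm_b = ([0.3, 0.7], [[0.5, 0.0], [3.0, 2.5]], [[1.0, 2.0], [0.5, 1.0]])

print(kl_kernel(gmm_a, gmm_a))  # identical models give the maximum similarity
print(kl_kernel(gmm_a, gmm_b))
```

In a setup like the one the abstract describes, such pairwise values over all patients' GMMs would form a Gram matrix passed to an SVM as a precomputed kernel. A kernel built this way is not guaranteed to be positive semidefinite, so this sketch should be read as an illustration of the similarity measure only.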

► Fusion of generative and discriminative models for similarity-based classification of larynx disorders from human voice is explored.
► We suggest two novel distance kernels for GMM fusion with SVM.
► Such fusion is compared to the usual SVM-based classification.
► The researched distance kernels provided the best overall performance.

Publisher

Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 54, Issue 5, June 2012, Pages 601–610

Authors

Evaldas Vaiciukynas, Antanas Verikas, Adas Gelzinis, Marija Bacauskiene, Virgilijus Uloza
