Pathological voice detection and binary classification using MPEG-7 audio features

Article ID	Journal	Published Year	Pages	File Type
558120	Biomedical Signal Processing and Control	2014	9 Pages	PDF

Abstract

• MPEG-7 audio descriptors are used for voice pathology detection/classification.
• Fisher ratio is applied to find the most discriminative features.
• Using only top five features the detection accuracy reached over 99%.
• Those five features obtained over 93% accuracy for binary classification.
• Achieved the best detection rate so far in MEEI database (subset).

Objectives
A pathological voice detection and classification method based on MPEG-7 audio low-level features is proposed in this paper. MPEG-7 features were originally intended for multimedia indexing, which covers both video and audio. Indexing is related to event detection, and because a pathological voice is a different event from a normal voice, we show that MPEG-7 part-4 audio low-level features perform very well in detecting pathological voices, as well as in binary classification of the pathologies.

Patients and methods
The experiments are done on a subset of sustained vowel (“AH”) recordings from healthy subjects and subjects with voice pathologies, taken from the Massachusetts Eye and Ear Infirmary (MEEI) database. For classification, a support vector machine (SVM) is applied. The Fisher discrimination ratio is applied as an optional feature selection method.

Results
The proposed method with MPEG-7 audio features and SVM classification is evaluated on voice pathology detection, as well as on binary pathology classification. The proposed method achieves an accuracy of 99.994% with a standard deviation of 0.0105% for detecting pathological voices, and an accuracy of up to 100% for binary pathology classification.

Conclusion
MPEG-7 descriptors can reliably be used for automatic voice pathology detection and classification.
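
The feature selection step can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this assumes the common two-class form of the Fisher ratio for a single feature, (mean_a - mean_b)^2 / (var_a + var_b). The function names `fisher_ratio` and `rank_features` and the toy feature values are hypothetical and are not taken from the paper or the MEEI database.

```python
from statistics import fmean, pvariance


def fisher_ratio(class_a, class_b):
    """Two-class Fisher ratio for one feature (assumed form):
    squared difference of class means over the sum of class variances."""
    spread = pvariance(class_a) + pvariance(class_b)
    mean_gap = fmean(class_a) - fmean(class_b)
    if spread == 0.0:
        # Degenerate case: no within-class spread at all.
        return float("inf") if mean_gap != 0.0 else 0.0
    return mean_gap ** 2 / spread


def rank_features(normal, pathological, top_k=5):
    """Score every feature column and return (top_k indices, all scores).

    normal / pathological: lists of equal-length feature vectors,
    one vector per recording.
    """
    n_features = len(normal[0])
    scores = []
    for j in range(n_features):
        col_normal = [row[j] for row in normal]
        col_patho = [row[j] for row in pathological]
        scores.append(fisher_ratio(col_normal, col_patho))
    # Highest ratio first: these features separate the two classes best.
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order[:top_k], scores


# Toy data (hypothetical): 3 recordings per class, 3 features each.
normal = [[1.0, 5.0, 0.2], [1.2, 4.0, 0.1], [0.8, 6.0, 0.3]]
pathological = [[3.0, 6.5, 0.2], [3.2, 5.5, 0.3], [2.8, 6.0, 0.1]]

top, scores = rank_features(normal, pathological, top_k=2)
print("selected feature indices:", top)
```

In a pipeline like the one described, only the selected feature columns would then be passed to the SVM classifier; the choice of five features reported in the highlights corresponds to `top_k=5` in this sketch.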

Keywords

Voice pathology detection