Article code: 558735
Journal code: 1451663
Publication year: 2015
English article: 10 pages, PDF
Full-text version: free download
English title of the ISI article
Multiband vocal dysperiodicities analysis using empirical mode decomposition in the log-spectral domain
Related topics
Engineering and Basic Sciences, Computer Engineering, Signal Processing
English abstract


• The EMD algorithm is used to decompose the log-magnitude spectrum of the speech signal.
• An EMD-based HNR is applied to the analysis of disordered voices.
• Multiband analysis is proposed to predict scores of perceived hoarseness.
• Experimental results show the effectiveness of the proposed approach.

In this paper, empirical mode decomposition (EMD) is proposed as an alternative method for decomposing the log-magnitude spectrum of the speech signal into its harmonic, envelope and noise components. The harmonic-to-noise ratio (HNR), an acoustic measure, is used to summarize the degree of disturbance in the speech signal and consequently to evaluate the overall quality of disordered voices produced by dysphonic speakers.

Most approaches to HNR estimation involve isolating individual speech cycles, or pseudo-harmonics in the speech spectrum or rahmonics in the cepstrum; however, this isolation cannot be carried out reliably in speech produced by severely hoarse speakers and may result in inaccurate HNR estimates. The EMD-based approach used in this study incorporates a procedure that automatically estimates the thresholds used by the clustering algorithm without knowledge of the fundamental frequency. The frequency range of the harmonic and noise components is divided into ten equally spaced intervals, and the HNRs within each interval are used as independent variables to summarize the amount of perceived hoarseness.

The proposed method is evaluated on a corpus comprising 251 normophonic and dysphonic speakers. Multiple correlation analysis carried out on HNRs from the different frequency bands shows that multi-band analysis based on EMD yields a statistically significantly higher correlation between predicted scores and scores of perceived hoarseness than full-band analysis.

Principal component analysis is carried out on the HNR measures obtained in the ten frequency bands. More than 97% of the total variance is explained by the first two principal components, PC1 and PC2. Experimental results show that the first principal component is interpretable in terms of the severity of hoarseness, whereas the second principal component indicates whether the voice is high-pitched or low-pitched. It is shown that the first two principal components result in a high predictability of hoarseness scores.
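
The multiband step and the principal component analysis mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the harmonic and noise components are already available as linear power spectra (the EMD decomposition and clustering are not reproduced here), it assumes HNR is the harmonic-to-noise energy ratio in dB, and the names `band_hnr_db` and `pca_explained_variance` are hypothetical.

```python
import numpy as np


def band_hnr_db(harmonic_power, noise_power, n_bands=10):
    """HNR in dB for each of n_bands (near-)equal-width frequency bands.

    Assumption: inputs are 1-D arrays of linear power per frequency bin,
    already separated into harmonic and noise parts by some earlier step.
    """
    harmonic_power = np.asarray(harmonic_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    if harmonic_power.shape != noise_power.shape:
        raise ValueError("harmonic and noise spectra must have the same shape")

    # Split the frequency axis into contiguous bands of (almost) equal width.
    h_bands = np.array_split(harmonic_power, n_bands)
    z_bands = np.array_split(noise_power, n_bands)

    eps = np.finfo(float).tiny  # guards against division by zero and log(0)
    return np.array([
        10.0 * np.log10((h.sum() + eps) / (z.sum() + eps))
        for h, z in zip(h_bands, z_bands)
    ])


def pca_explained_variance(features):
    """PCA via SVD on a (speakers x bands) matrix of band HNRs.

    Returns the fraction of total variance explained by each principal
    component (in descending order) and the per-speaker component scores.
    """
    X = np.asarray(features, dtype=float)
    Xc = X - X.mean(axis=0)  # center each band across speakers
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    variances = s ** 2
    ratio = variances / variances.sum()
    scores = Xc @ vt.T  # projection of each speaker onto the components
    return ratio, scores


if __name__ == "__main__":
    # Synthetic data only; these are not values from the paper.
    rng = np.random.default_rng(0)
    hnr_matrix = np.vstack([
        band_hnr_db(rng.uniform(1.0, 10.0, 256), rng.uniform(0.1, 1.0, 256))
        for _ in range(30)
    ])
    ratio, scores = pca_explained_variance(hnr_matrix)
    print("variance explained by PC1 and PC2:", ratio[:2].sum())
```

In this sketch each speaker contributes one row of ten band HNRs, and the first two columns of `scores` play the role of PC1 and PC2; how well they predict hoarseness scores would still have to be assessed against perceptual ratings.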

Publisher
Database: Elsevier - ScienceDirect
Journal: Biomedical Signal Processing and Control - Volume 17, March 2015, Pages 11–20