Article code: 241944
Journal code: 501793
Publication year: 2014
English article: 9 pages, PDF
Full-text version: free download
Title of the ISI article
Artificial neural networks as speech recognisers for dysarthric speech: Identifying the best-performing set of MFCC parameters and studying a speaker-independent approach
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
Abstract


• The best-performing set of MFCC parameters for dysarthric speech was studied.
• A speaker-independent dysarthric ASR model based on ANNs is proposed.
• The ASR systems trained on 12-coefficient mel cepstrum features provided the best accuracy.
• The proposed speaker-independent ASR model achieved a word recognition rate of 68.38%.
• The highest word recognition rate of the speaker-dependent ASR systems was 95%.

Dysarthria is a neurological impairment in the control of the motor speech articulators that compromises the speech signal. Automatic Speech Recognition (ASR) can be very helpful for speakers with dysarthria because these speakers are often physically incapacitated. Mel-Frequency Cepstral Coefficients (MFCCs) have been shown to be an appropriate representation of dysarthric speech, but the question of which MFCC-based feature set represents dysarthric acoustic features most effectively has not been answered. Moreover, most current dysarthric speech recognisers are either speaker-dependent (SD) or speaker-adaptive (SA), and they generalise poorly when used as speaker-independent (SI) models. First, by comparing the results of 28 dysarthric SD speech recognisers, this study identifies the best-performing set of MFCC parameters for representing dysarthric acoustic features in Artificial Neural Network (ANN)-based ASR. Next, this paper studies the application of ANNs as a fixed-length isolated-word SI ASR for individuals with dysarthria. The results show that the speech recognisers trained on the conventional 12-coefficient MFCC features, without delta and acceleration features, provided the best accuracy, and that the proposed SI ASR recognised the speech of unseen dysarthric evaluation subjects with a word recognition rate of 68.38%.
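
To make the feature sets compared in the abstract concrete, the following is a minimal sketch of how a 12-coefficient static MFCC matrix differs from the same matrix extended with delta and acceleration features, and of how a variable-length utterance could be turned into a fixed-length network input. It is not the authors' implementation: the regression window of 2, the zero-padding strategy, the frame count, and the function names `delta`, `stack_features` and `to_fixed_length` are all assumptions made for illustration, and the static MFCC frames are assumed to have been computed elsewhere.

```python
from typing import List

Frame = List[float]  # one frame of static MFCCs (12 values in the abstract's setting)


def delta(frames: List[Frame], window: int = 2) -> List[Frame]:
    """Regression-based delta features; edge frames are replicated.

    The window size of 2 is an assumption, not a value taken from the paper.
    """
    if not frames:
        return []
    last = len(frames) - 1
    denom = 2 * sum(n * n for n in range(1, window + 1))
    out: List[Frame] = []
    for t in range(len(frames)):
        row: Frame = []
        for k in range(len(frames[t])):
            num = 0.0
            for n in range(1, window + 1):
                ahead = frames[min(t + n, last)][k]   # clamp at the last frame
                behind = frames[max(t - n, 0)][k]     # clamp at the first frame
                num += n * (ahead - behind)
            row.append(num / denom)
        out.append(row)
    return out


def stack_features(static: List[Frame], use_dynamic: bool) -> List[Frame]:
    """Return static frames only, or static + delta + acceleration per frame."""
    if not use_dynamic:
        return [list(f) for f in static]
    d1 = delta(static)   # delta (first-order dynamics)
    d2 = delta(d1)       # acceleration (delta of delta)
    return [list(s) + a + b for s, a, b in zip(static, d1, d2)]


def to_fixed_length(frames: List[Frame], n_frames: int) -> List[float]:
    """Truncate or zero-pad to n_frames, then flatten into one input vector.

    Zero-padding is a hypothetical choice; the abstract does not say how the
    fixed-length input is built.
    """
    if not frames:
        raise ValueError("at least one frame is required")
    dim = len(frames[0])
    kept = [list(f) for f in frames[:n_frames]]
    kept += [[0.0] * dim for _ in range(n_frames - len(kept))]
    return [v for f in kept for v in f]
```

With 12 static coefficients per frame, `stack_features(static, False)` keeps 12 values per frame, while `stack_features(static, True)` yields 36, so the input layer of a fixed-length network is three times larger in the second case for the same number of frames.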

Publisher
Database: Elsevier - ScienceDirect
Journal: Advanced Engineering Informatics - Volume 28, Issue 1, January 2014, Pages 102–110