Article code: 406880 | Journal code: 678114 | Publication year: 2014 | English article: 9 pages, PDF | Full text: free download
English title of the ISI article
Real-time frequency-based noise-robust Automatic Speech Recognition using Multi-Nets Artificial Neural Networks: A multi-views multi-learners approach
Persian translation of the title
تشخیص مستقیم گفتار مبتنی بر فرکانس زمان واقعی با استفاده از شبکه های عصبی مصنوعی چند شبکه ای: چند رویکرد روش چند فراگیری
Keywords
Multiple-views multiple-learners; Automatic Speech Recognition; Artificial Neural Networks; Noise robustness; Frequency-based noise
Related topics
Engineering and Basic Sciences; Computer Engineering; Artificial Intelligence
English abstract


• A noise-robust Multi-Networks Speech Recogniser model is provided.
• The proposed model is based on Multi-Nets Artificial Neural Networks.
• The proposed model was verified using unseen test data corrupted by additive noise.
• A detailed performance comparison is made between the proposed model and monolithic ANN-based ASRs.
• The proposed recogniser improved the recognition rate by up to 20.14% compared with MVSL ANN-based ASRs.
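
To make the phrase "additive noise" concrete, the following is a minimal sketch of how test signals of this kind might be prepared. It is not the authors' procedure: the spectral-shaping method, the 16 kHz sampling rate, the 10 dB signal-to-noise ratio, the sine-wave stand-in for speech, and the helper names `coloured_noise`, `mix_at_snr`, and `snr_db` are all assumptions chosen for illustration. White, pink, and brown noise are taken in their conventional sense, with power falling off as 1/f^0, 1/f^1, and 1/f^2 respectively.

```python
import numpy as np

def coloured_noise(n_samples, exponent, rng):
    """Zero-mean, unit-variance noise with power spectrum ~ 1/f**exponent.

    exponent = 0 gives white, 1 gives pink, 2 gives brown noise.
    """
    white = rng.standard_normal(n_samples)
    spectrum = np.fft.rfft(white)
    freqs = np.fft.rfftfreq(n_samples)
    freqs[0] = 1.0                            # placeholder to avoid dividing by zero
    spectrum /= freqs ** (exponent / 2.0)     # amplitude ~ f**(-exponent / 2)
    spectrum[0] = 0.0                         # drop the DC component
    shaped = np.fft.irfft(spectrum, n=n_samples)
    return shaped / np.std(shaped)

def mix_at_snr(clean, noise, snr_db_target):
    """Add `noise` to `clean`, scaled so the mixture has the target SNR in dB."""
    if len(noise) < len(clean):
        raise ValueError("noise must be at least as long as the clean signal")
    noise = noise[: len(clean)]
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    # Noise power required for the target SNR: P_clean / 10**(SNR / 10)
    target_power = clean_power / (10.0 ** (snr_db_target / 10.0))
    return clean + noise * np.sqrt(target_power / noise_power)

def snr_db(clean, noisy):
    """Measured SNR in dB, treating (noisy - clean) as the noise."""
    residual = noisy - clean
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(residual ** 2))

rng = np.random.default_rng(0)
sample_rate = 16000                           # assumed sampling rate
t = np.arange(sample_rate) / sample_rate
clean = np.sin(2 * np.pi * 440.0 * t)         # stand-in tone, not real speech

noises = {name: coloured_noise(len(clean), exponent, rng)
          for name, exponent in [("white", 0.0), ("pink", 1.0), ("brown", 2.0)]}
noisy = {name: mix_at_snr(clean, noise, 10.0) for name, noise in noises.items()}

for name, signal in noisy.items():
    print(name, round(snr_db(clean, signal), 2))
```

Scaling the noise rather than the speech keeps the clean signal identical across conditions, so any change in recognition rate can be attributed to the noise type and level alone.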

Automatic Speech Recognition (ASR) is a technology for identifying uttered word(s) represented as an acoustic signal. One of the important aspects of a noise-robust ASR system is its ability to recognise speech accurately in noisy conditions. This paper studies the application of Multi-Nets Artificial Neural Networks (M-N ANNs), a realisation of the multiple-views multiple-learners approach, as Multi-Networks Speech Recognisers (M-NSRs) to provide a real-time, frequency-based, noise-robust ASR model. M-NSRs treat the speech features associated with each word as a separate view and assign a standalone ANN as the learner that approximates that view; in contrast, multiple-views single-learner (MVSL) ANN-based speech recognisers employ only one ANN to memorise the features of the entire vocabulary. In this research, an M-NSR was built and evaluated using unseen test data corrupted by white, brown, and pink noise; specifically, 27 experiments were conducted on noisy speech to measure the accuracy and recognition rate of the proposed model. The results of the M-NSR were also compared in detail with those of an MVSL ANN-based ASR system. In our experiments, the M-NSR improved the average recognition rate by up to 20.14% on test data corrupted by noise. The results show that the M-NSR, with its higher degree of generalisability, can handle frequency-based noise, as it achieves a higher recognition rate than the previous model under noisy conditions.
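
The structural difference between the two recogniser families described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the `TinyNet` class, the hidden-layer sizes, the training settings, the synthetic Gaussian features produced by `make_data`, and the rule of picking the word whose network gives the highest score are all hypothetical. Only the overall split (one network per word versus one network for the whole vocabulary) comes from the abstract.

```python
import numpy as np

class TinyNet:
    """One-hidden-layer MLP trained by full-batch gradient descent."""

    def __init__(self, n_in, n_hidden, n_out, rng):
        self.w1 = rng.normal(0.0, 0.5, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.w2 = rng.normal(0.0, 0.5, (n_hidden, n_out))
        self.b2 = np.zeros(n_out)

    def forward(self, x):
        h = np.tanh(x @ self.w1 + self.b1)
        z = h @ self.w2 + self.b2
        if z.shape[1] == 1:                            # single output: sigmoid
            return h, 1.0 / (1.0 + np.exp(-z))
        e = np.exp(z - z.max(axis=1, keepdims=True))   # several outputs: softmax
        return h, e / e.sum(axis=1, keepdims=True)

    def fit(self, x, y, epochs=400, lr=0.5):
        for _ in range(epochs):
            h, p = self.forward(x)
            dz = (p - y) / len(x)       # cross-entropy gradient w.r.t. the logits
            dh = (dz @ self.w2.T) * (1.0 - h ** 2)
            self.w2 -= lr * (h.T @ dz)
            self.b2 -= lr * dz.sum(axis=0)
            self.w1 -= lr * (x.T @ dh)
            self.b1 -= lr * dh.sum(axis=0)
        return self

    def predict_proba(self, x):
        return self.forward(x)[1]

def make_data(n_per_word, n_words, n_features, rng):
    """Synthetic stand-in for per-word speech features (not real speech)."""
    labels = np.repeat(np.arange(n_words), n_per_word)
    centroids = 2.0 * np.eye(n_words, n_features)
    x = centroids[labels] + rng.normal(0.0, 0.3, (len(labels), n_features))
    return x, labels

rng = np.random.default_rng(0)
n_words, n_features = 3, 6
x_train, y_train = make_data(100, n_words, n_features, rng)
x_test, y_test = make_data(50, n_words, n_features, rng)
one_hot = np.eye(n_words)[y_train]

# Multi-net recogniser: one binary network per word, trained word-vs-rest.
word_nets = [TinyNet(n_features, 4, 1, rng).fit(x_train, one_hot[:, [k]])
             for k in range(n_words)]
scores = np.hstack([net.predict_proba(x_test) for net in word_nets])
multi_net_pred = scores.argmax(axis=1)     # assumed rule: highest score wins

# Monolithic (MVSL-style) recogniser: a single network with one output per word.
mono_net = TinyNet(n_features, 8, n_words, rng).fit(x_train, one_hot)
mono_pred = mono_net.predict_proba(x_test).argmax(axis=1)

print("multi-net accuracy:", (multi_net_pred == y_test).mean())
print("monolithic accuracy:", (mono_pred == y_test).mean())
```

On clean, well-separated toy data both designs should classify almost everything correctly, so this sketch shows only how the two architectures are wired. It says nothing about their relative robustness to noise, which is what the paper's experiments measure.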

Publisher
Database: Elsevier - ScienceDirect
Journal: Neurocomputing - Volume 129, 10 April 2014, Pages 199–207