A hearing-inspired approach for distant-microphone speech recognition in the presence of multiple sources

Article code	Journal code	Publication year	English article
558360	874908	2013	17-page PDF

Keywords

Noise robustness; Auditory scene analysis

Related topics

Engineering and Basic Sciences, Computer Engineering, Signal Processing

Abstract

This paper addresses the problem of speech recognition in reverberant multisource noise conditions using distant binaural microphones. Our scheme employs a two-stage fragment decoding approach inspired by Bregman's account of auditory scene analysis, in which innate primitive grouping ‘rules’ are balanced by the role of learnt schema-driven processes. First, the acoustic mixture is split into local time-frequency fragments of individual sound sources using signal-level primitive grouping cues. Second, statistical models are employed to select fragments belonging to the sound source of interest, and the hypothesis-driven stage simultaneously searches for the most probable speech/background segmentation and the corresponding acoustic model state sequence. The paper reports recent advances in combining adaptive noise floor modelling and binaural localisation cues within this framework. By integrating signal-level grouping cues with acoustic models of the target sound source in a probabilistic framework, the system is able to simultaneously separate and recognise the sound of interest from the mixture, and derive significant recognition performance benefits from different grouping cue estimates despite their inherent unreliability in noisy conditions. Finally, the paper will show that missing data imputation can be applied via fragment decoding to allow reconstruction of a clean spectrogram that can be further processed and used as input to conventional ASR systems. The best performing system achieves an average keyword recognition accuracy of 85.83% on the PASCAL CHiME Challenge task.
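The imputation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes log-spectral features, a speech/background segmentation that has already been chosen, and a single Gaussian clean-speech prior per time-frequency cell (a real system would more plausibly use state-conditioned mixture models). The function name `bounded_impute` and all of its parameters are hypothetical. Cells labelled as speech keep their observed value; cells labelled as background are replaced by the mean of the prior truncated above at the observed energy, since the clean speech cannot exceed the mixture.

```python
import math

import numpy as np


def _norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _norm_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bounded_impute(noisy, speech_mask, prior_mean, prior_std):
    """Reconstruct a clean log-spectrogram from a noisy one (illustrative only).

    noisy       : observed log-energies of the mixture
    speech_mask : True where the cell is assumed to be dominated by speech
    prior_mean, prior_std : assumed Gaussian clean-speech prior per cell
    """
    noisy = np.asarray(noisy, dtype=float)
    mask = np.broadcast_to(np.asarray(speech_mask, dtype=bool), noisy.shape)
    mean = np.broadcast_to(np.asarray(prior_mean, dtype=float), noisy.shape)
    std = np.broadcast_to(np.asarray(prior_std, dtype=float), noisy.shape)

    clean = noisy.copy()
    for idx in np.ndindex(noisy.shape):
        if mask[idx]:
            continue  # speech-dominated cell: trust the observation
        # Mean of the Gaussian prior truncated to values <= observed energy.
        beta = (noisy[idx] - mean[idx]) / std[idx]
        cdf = max(_norm_cdf(beta), 1e-12)  # guard against division by zero
        estimate = mean[idx] - std[idx] * _norm_pdf(beta) / cdf
        # Enforce the upper bound even when the guard above was triggered.
        clean[idx] = min(estimate, noisy[idx])
    return clean


# Toy 2x2 spectrogram: the right-hand column is assumed to be masked by noise.
noisy = np.array([[6.0, 9.0], [5.5, 8.5]])
speech_mask = np.array([[True, False], [True, False]])
clean = bounded_impute(noisy, speech_mask, prior_mean=4.0, prior_std=1.0)
```

In this toy case the masked cells fall back to roughly the prior mean because the observed energy lies far above it; when the observation is close to or below the prior mean, the bound pulls the estimate down instead.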

► A hearing-inspired approach for noise-robust distant-microphone speech recognition.
► Combines aspects of noise modelling and source separation approaches.
► Integration of spatial cues over spectro-temporal regions provides more reliable location estimates.
► The framework allows reconstruction of clean spectrogram from noisy signals.
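The point about integrating spatial cues over spectro-temporal regions can be sketched as follows. This is an illustration under assumptions, not the method used in the paper: it supposes that each time-frequency cell already carries a noisy azimuth estimate in degrees and an integer fragment label, and it pools the estimates with a median, which is one simple robust choice. The function name `pool_azimuth_by_fragment` is hypothetical.

```python
import numpy as np


def pool_azimuth_by_fragment(cell_azimuth, fragment_labels):
    """Return {fragment id: median azimuth over the cells of that fragment}."""
    cell_azimuth = np.asarray(cell_azimuth, dtype=float)
    fragment_labels = np.asarray(fragment_labels)
    pooled = {}
    for frag in np.unique(fragment_labels):
        in_fragment = fragment_labels == frag
        pooled[int(frag)] = float(np.median(cell_azimuth[in_fragment]))
    return pooled


# Rows are frequency channels, columns are time frames (made-up values).
cell_azimuth = [
    [-28.0, -35.0, 12.0],
    [-31.0, 40.0, 9.0],
    [-30.0, -29.0, 55.0],
]
fragment_labels = [
    [1, 1, 2],
    [1, 1, 2],
    [1, 1, 2],
]
pooled = pool_azimuth_by_fragment(cell_azimuth, fragment_labels)
```

Individual cells such as the 40 and 55 degree outliers would point to the wrong source on their own, whereas the pooled value for each fragment stays close to the majority of its cells.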

Publisher

Database: Elsevier - ScienceDirect
Journal: Computer Speech & Language - Volume 27, Issue 3, May 2013, Pages 820–836

Authors

Ning Ma, Jon Barker, Heidi Christensen, Phil Green
