An automatic speech recognition system based on the scene analysis account of auditory perception

Article code	Journal code	Publication year	English article
568940	876494	2007	18 pages (PDF)

Keywords

Robust speech recognition

Related subjects

Engineering and basic sciences, Computer engineering, Signal processing

Abstract

Despite many years of concentrated research, the performance gap between automatic speech recognition (ASR) and human speech recognition (HSR) remains large. The difference between ASR and HSR is particularly evident when considering the response to additive noise. Whereas human performance is remarkably robust, ASR systems are brittle and only operate well within the narrow range of noise conditions for which they were designed. This paper considers how humans may achieve noise robustness. We take the view that robustness is achieved because the human perceptual system treats the problems of speech recognition and sound source separation as being tightly coupled. Taking inspiration from Bregman’s Auditory Scene Analysis account of auditory organisation, we present a speech recognition system which couples these processes by using a combination of primitive and schema-driven processes: first, a set of coherent spectro-temporal fragments is generated by primitive segmentation techniques; then, a decoder based on statistical ASR techniques performs a simultaneous search for the correct background/foreground segmentation and word sequence hypothesis. Mutually supporting solutions to both the source segmentation and speech recognition problems arise as a result. The decoder is tested on a challenging corpus of connected digit strings mixed monaurally at 0 dB and recognition performance is compared with that achieved by listeners using identical data. The results, although preliminary, are encouraging and suggest that techniques which interface ASA and statistical ASR have great potential. The paper concludes with a discussion of future research directions that may further develop this class of perceptually motivated ASR solutions.
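To make the 0 dB test condition concrete: mixing two signals monaurally at 0 dB means scaling them to equal average power and summing them into a single channel. The sketch below illustrates that arithmetic only; it is not the authors' corpus-construction procedure. The names `mean_power` and `mix_at_snr` are hypothetical, and the two sine tones (with an assumed 8 kHz sampling rate) are synthetic stand-ins for a target utterance and a competing signal.

```python
import math


def mean_power(signal):
    """Average power (mean squared amplitude) of a sequence of samples."""
    return sum(s * s for s in signal) / len(signal)


def mix_at_snr(target, masker, snr_db):
    """Scale the masker so the target-to-masker power ratio equals snr_db,
    then sum both signals into one (monaural) channel."""
    if len(target) != len(masker):
        raise ValueError("signals must have the same length")
    masker_power = mean_power(masker)
    if masker_power == 0:
        raise ValueError("masker is silent")
    # SNR(dB) = 10 * log10(P_target / P_masker); solve for the masker gain.
    wanted_masker_power = mean_power(target) / (10 ** (snr_db / 10))
    gain = math.sqrt(wanted_masker_power / masker_power)
    mixture = [t + gain * m for t, m in zip(target, masker)]
    return mixture, gain


# Synthetic stand-ins for two utterances (assumption: 8 kHz sampling, pure tones).
rate = 8000
target = [math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
masker = [0.3 * math.sin(2 * math.pi * 620 * n / rate) for n in range(rate)]

mixture, gain = mix_at_snr(target, masker, snr_db=0.0)
scaled_masker = [gain * m for m in masker]
achieved = 10 * math.log10(mean_power(target) / mean_power(scaled_masker))
print(f"masker gain: {gain:.3f}, achieved SNR: {achieved:.2f} dB")
```

At 0 dB neither source is louder on average, so a recogniser cannot rely on overall level to decide which spectro-temporal regions belong to the target. That is the situation in which the coupled segmentation and recognition search described in the abstract is meant to help.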

Publisher

Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 49, Issue 5, May 2007, Pages 384–401

Authors

André Coy, Jon Barker
