Speech encoding in a model of peripheral auditory processing: Quantitative assessment by means of automatic speech recognition

Article code	Journal code	Publication year	English article	Full-text version
568867	876478	2007	16-page PDF	Free download

Keywords

Speech encoding; Automatic speech recognition; Auditory nerve; Auditory model

Related topics

Engineering and Basic Sciences; Computer Engineering; Signal Processing

Abstract

Our notion of how speech is processed is still very much dominated by von Helmholtz’s theory of hearing. He deduced that the human inner ear decomposes the spectrum of sound signals. However, physiological recordings of auditory nerve fibers (ANF) showed that the rate-place code, which is thought to transmit spectral information to the brain, is at least complemented by a temporal code. In our paper we challenge the rate-place code using a complex but realistic scenario: speech in noise. We used a detailed model of human auditory processing that closely replicates key aspects of auditory nerve spike trains. We performed quantitative evaluations of coding strategies using standard automatic speech recognition (ASR) tools. Our test data was spoken letters of the whole English alphabet from a variety of speakers, with and without background noise. We evaluated a purely rate-place-based encoding strategy, a temporal strategy based on interspike intervals, and a combination thereof. The results suggest that as few as 4% of the total number of ANFs would be sufficient to code speech information in a rate-place fashion. Rate-place coding performed its best for speech in clean conditions at normal sound level, but broke down at higher-than-normal levels, and failed dramatically in noise at high levels. Low-spontaneous rate fibers improved the rate-place code, mainly for vowels and at higher-than-normal levels. At high speech levels, and in particular in the presence of background noise, combining rate-place coding with the temporal coding strategy greatly improved recognition accuracy. We therefore conclude that the human auditory system does not rely on a rate-place code alone but requires the abundance of fibers for precise temporal coding.
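The two coding strategies compared in the abstract can be illustrated with a minimal sketch. This is not the authors' feature extraction, which the abstract does not describe; it only assumes that each auditory nerve fiber is represented by a sorted list of spike times in milliseconds. The function names `rate_place_counts` and `isi_histogram`, the frame length, and the bin width are all hypothetical choices made for illustration. A rate-place feature counts spikes per fiber and time frame, ignoring timing within the frame, while an interspike-interval feature keeps the fine timing between successive spikes of one fiber.

```python
from bisect import bisect_left


def rate_place_counts(spike_trains, frame_len_ms, n_frames):
    """Spike count per frame and per fiber (hypothetical rate-place feature).

    spike_trains: one sorted list of spike times (ms) per fiber ("place").
    Dividing a count by frame_len_ms would give a firing rate.
    """
    features = []
    for k in range(n_frames):
        start, end = k * frame_len_ms, (k + 1) * frame_len_ms
        # Count spikes with start <= t < end for every fiber.
        frame = [bisect_left(s, end) - bisect_left(s, start) for s in spike_trains]
        features.append(frame)
    return features


def isi_histogram(spikes, bin_width_ms, n_bins):
    """Histogram of first-order interspike intervals of a single fiber."""
    hist = [0] * n_bins
    for earlier, later in zip(spikes, spikes[1:]):
        b = int((later - earlier) / bin_width_ms)
        if b < n_bins:  # intervals beyond the last bin are dropped
            hist[b] += 1
    return hist


# Toy data, invented for illustration: two fibers, spike times in ms.
trains = [[1.0, 3.0, 5.0, 12.0], [2.0, 11.0]]
counts = rate_place_counts(trains, frame_len_ms=10.0, n_frames=2)
intervals = isi_histogram(trains[0], bin_width_ms=1.0, n_bins=8)
```

A combined strategy, in this sketch, would simply concatenate the two feature vectors per frame before passing them to a recognizer; how the paper combines them is not stated in the abstract.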

Publisher

Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 49, Issue 12, December 2007, Pages 917–932

Authors

Marcus Holmberg, David Gelbart, Werner Hemmert
