Article code: 566347
Journal code: 875967
Publication year: 2008
English article: 16 pages, PDF
Full-text version: free download
English title of the ISI article
Multisensory processing for speech enhancement and magnitude-normalized spectra for speech modeling
Related topics
Engineering and Basic Sciences > Computer Engineering > Signal Processing
Abstract

In this paper, we tackle the problem of speech enhancement from two fronts: speech modeling and multisensory input. We present a new speech model based on statistics of magnitude-normalized complex spectra of speech signals. By performing magnitude normalization, we are able to get rid of huge intra- and inter-speaker variation in speech energy and to build a better speech model with a smaller number of Gaussian components. To deal with real-world problems with multiple noise sources, we propose to use multiple heterogeneous sensors, and in particular, we have developed microphone headsets that combine a conventional air microphone and a bone sensor. The bone sensor makes direct contact with the speaker’s temple (area behind the ear), and captures the vibrations of the bones and skin during the process of vocalization. The signals captured by the bone microphone, though distorted, contain useful audio information, especially in the low frequency range, and more importantly, they are very robust to external noise sources (stationary or not). By fusing the bone channel signals with the air microphone signals, much improved speech signals have been obtained.
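
As a rough illustration of the magnitude-normalization idea mentioned in the abstract, the sketch below scales each frame's complex spectrum to unit Euclidean norm. The abstract does not state the exact normalization used, so the unit-norm choice is an assumption, and the names `dft` and `magnitude_normalize` as well as the toy signal are hypothetical. This is not the authors' speech model, only a demonstration of why removing overall frame energy makes loud and quiet versions of the same sound look alike.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of one real-valued frame."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]


def magnitude_normalize(spectrum, eps=1e-12):
    """Scale a complex spectrum to unit Euclidean norm (assumed normalization).

    Phase and the relative shape across frequency bins are kept; only the
    overall frame energy is removed.
    """
    norm = math.sqrt(sum(abs(c) ** 2 for c in spectrum))
    return [c / (norm + eps) for c in spectrum]


# The same toy frame at two loudness levels (hypothetical data).
quiet = [0.1 * math.sin(2 * math.pi * 3 * t / 16) for t in range(16)]
loud = [10.0 * s for s in quiet]

quiet_norm = magnitude_normalize(dft(quiet))
loud_norm = magnitude_normalize(dft(loud))

# Largest per-bin difference between the two normalized spectra.
max_diff = max(abs(a - b) for a, b in zip(quiet_norm, loud_norm))
```

Under this assumption, the two normalized spectra differ only by floating-point error, which is the kind of energy invariance the abstract credits for needing fewer Gaussian components.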

Publisher
Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 50, Issue 3, March 2008, Pages 228–243