Article code: 557936
Journal code: 874817
Publication year: 2011
English article: 16-page PDF
Full text: free download
English title of the ISI article
The use of phase in complex spectrum subtraction for robust speech recognition
Related subjects
Engineering and Basic Sciences; Computer Engineering; Signal Processing
Abstract

In this paper we propose a new method for utilising phase information: complex spectrum subtraction (CSS), which complements traditional magnitude-only spectral subtraction speech enhancement with phase information. The proposed approach has the following advantages over traditional magnitude-only spectral subtraction: (a) it introduces complementary information to the enhancement algorithm; (b) it reduces the total number of algorithmic parameters; and (c) it is designed for improving clean speech magnitude spectra and is therefore suitable for both automatic speech recognition (ASR) and speech perception applications. Oracle-based ASR experiments verify this approach, showing an average relative word accuracy improvement of 20% when accurate estimates of the phase spectrum are available. Based on sinusoidal analysis and assuming stationarity between observations (an assumption shown to hold more closely as the frame rate is increased), this paper also proposes a novel method for acquiring the phase information, called Phase Estimation via Delay Projection (PEDEP). Further oracle ASR experiments validate the potential of the proposed PEDEP technique in ideal conditions. A realistic implementation of CSS with PEDEP shows performance comparable to state-of-the-art spectral subtraction techniques in environments with signal-to-noise ratios of 15–20 dB. These results clearly demonstrate the potential for using phase spectra in spectral subtractive enhancement applications, and at the same time highlight the need for deriving more accurate phase estimates in a wider range of noise conditions.
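The abstract does not state the subtraction equations, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the usual additive model in the short-time Fourier domain (noisy bin = clean bin + noise bin) and an oracle noise estimate, and it contrasts magnitude-only subtraction with subtraction of a complex noise estimate whose phase is known. The function names `magnitude_subtract` and `complex_subtract` and the toy bin values are hypothetical, and PEDEP itself is not implemented here.

```python
import cmath

def magnitude_subtract(noisy, noise_mag, floor=0.0):
    """Magnitude-only spectral subtraction: subtract |N| from |Y| per bin
    and reuse the noisy phase. `floor` clamps negative magnitudes."""
    out = []
    for y, n_mag in zip(noisy, noise_mag):
        mag = max(abs(y) - n_mag, floor)
        out.append(cmath.rect(mag, cmath.phase(y)))
    return out

def complex_subtract(noisy, noise_mag, noise_phase):
    """Complex subtraction: rebuild the noise estimate as a complex number
    from its magnitude and phase, then subtract it from the noisy bin."""
    return [y - cmath.rect(n_mag, n_ph)
            for y, n_mag, n_ph in zip(noisy, noise_mag, noise_phase)]

# Toy single-frame spectra (assumed values, three frequency bins).
clean = [cmath.rect(1.0, 0.3), cmath.rect(0.5, -1.2), cmath.rect(2.0, 2.0)]
noise = [cmath.rect(0.4, 1.5), cmath.rect(0.3, 0.4), cmath.rect(0.6, -2.5)]
noisy = [c + n for c, n in zip(clean, noise)]

# Oracle noise estimates: exact magnitude and exact phase.
noise_mag = [abs(n) for n in noise]
noise_phase = [cmath.phase(n) for n in noise]

css = complex_subtract(noisy, noise_mag, noise_phase)
mss = magnitude_subtract(noisy, noise_mag)

# Largest error in the recovered clean magnitude spectrum for each method.
css_err = max(abs(abs(a) - abs(c)) for a, c in zip(css, clean))
mss_err = max(abs(abs(a) - abs(c)) for a, c in zip(mss, clean))
print(f"complex subtraction error:   {css_err:.3e}")
print(f"magnitude subtraction error: {mss_err:.3e}")
```

Under these oracle assumptions the complex subtraction recovers the clean magnitudes up to rounding, whereas the magnitude-only version is exact only when speech and noise happen to share the same phase in a bin. With an imperfect phase estimate the advantage shrinks, which is consistent with the abstract's call for more accurate phase estimates.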

Research highlights
▶ Complex spectrum subtraction introduces phase information to spectral subtraction.
▶ Phase Estimation via Delay Projection (PEDEP) proposed for noise or speech phase.
▶ Approach shows average 20% relative word accuracy improvement in ideal conditions.
▶ Practical implementation comparable to state of the art in high SNR environments.

Publisher
Database: Elsevier - ScienceDirect
Journal: Computer Speech & Language - Volume 25, Issue 3, July 2011, Pages 585–600