Article code: 4977835
Journal code: 1452014
Publication year: 2017
English article: 12 pages, PDF
Full-text version: free download
English title of the ISI article
Epoch extraction from emotional speech using single frequency filtering approach
Persian translation of the title
استخراج عصر از سخنرانی احساسی با استفاده از یک روش فیلتر کردن تک فرکانس
Related topics
Engineering and Basic Sciences > Computer Engineering > Signal Processing
English abstract


- An approach that exploits the impulse-like nature of the excitation in the speech signal is explored.
- Three properties of an impulse are used to extract impulse-like discontinuities:
  (a) An impulse in the time domain has a flat spectrum in the frequency domain.
  (b) An impulse in the time domain is a high-energy event, i.e., its strength is substantially larger than the strengths of the samples in its neighborhood.
  (c) The effect of an impulse is spread across all frequencies.
- We demonstrate the benefit of exploiting impulse-like events in terms of epoch identification accuracy.
- The results of the proposed methods for emotional speech are comparable to those for neutral speech, and are better than those of many standard methods.
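
The impulse properties listed in the highlights can be checked numerically. The following is a minimal sketch, not taken from the paper: the signal length, the impulse position, and the 8-cycle sinusoid used for contrast are all assumptions chosen for illustration.

```python
import numpy as np

n = 256
impulse = np.zeros(n)
impulse[100] = 1.0  # unit impulse at an arbitrary sample (assumed position)

# Properties (a) and (c): the magnitude spectrum of an impulse is flat,
# so its effect is present at every frequency bin.
impulse_spectrum = np.abs(np.fft.rfft(impulse))

# Contrast: a sinusoid with exactly 8 cycles in the window
# concentrates its energy in a single bin.
tone = np.sin(2.0 * np.pi * 8 * np.arange(n) / n)
tone_spectrum = np.abs(np.fft.rfft(tone))

# Property (b): the impulse sample dominates its neighborhood.
neighborhood = impulse[90:111]             # samples 90..110
others = np.delete(neighborhood, 10)       # drop sample 100 itself
peak_to_rest = neighborhood.max() - others.max()
```

Here `impulse_spectrum` is constant across bins, while `tone_spectrum` is close to zero everywhere except at bin 8.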

Epochs are instants of significant excitation of the vocal tract system during the production of voiced speech. Existing epoch extraction methods give good results on neutral speech, but their effectiveness has not been examined carefully for emotional speech, where the emotion characteristics are embedded mainly in the source component of the signal. The performance of state-of-the-art epoch extraction methods on emotional speech data may be affected by large variations in the pitch period. This paper explores an approach that exploits the impulse-like nature of the excitation in the speech signal instead of pitch period information. The approach uses single frequency filtering (SFF) analysis of speech signals, which provides high temporal resolution for some features of the excitation source (such as impulse-like events) and high spectral resolution for some features of the spectrum (such as harmonics and resonances). The Berlin emotional speech database (EMO-DB), which contains simultaneous electroglottograph (EGG) recordings, is used as the ground truth. For comparison, several epoch extraction methods are evaluated in terms of both reliability and accuracy measures for six emotion categories and neutral speech. The results indicate that the performance of the proposed SFF-based methods on emotional speech is comparable to that on neutral speech, and is better than that of many standard methods.
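
The abstract does not spell out the SFF computation, so the following is only a sketch of a commonly described formulation: the differenced signal is frequency-shifted so that the frequency of interest lands at half the sampling rate, and is then passed through a single-pole filter whose pole lies close to the unit circle at z = -r. The function name `sff_envelope`, the pole radius `r = 0.995`, the 8 kHz sampling rate, and the frequency grid are assumptions for illustration, and the step that selects epochs from the envelopes is deliberately left out because it is not described here.

```python
import numpy as np

def sff_envelope(x, fs, freq_hz, r=0.995):
    """Amplitude envelope of x at one frequency (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    # First difference to de-emphasize slowly varying components.
    d = np.diff(x, prepend=x[0])
    n = np.arange(len(d))
    # Shift the component at freq_hz to half the sampling frequency (omega = pi).
    shifted = d * np.exp(1j * (np.pi - 2.0 * np.pi * freq_hz / fs) * n)
    # Single-pole filter with its pole at z = -r:
    #   y[n] = -r * y[n-1] + shifted[n]
    y = np.zeros(len(d), dtype=complex)
    prev = 0.0 + 0.0j
    for i, s in enumerate(shifted):
        prev = -r * prev + s
        y[i] = prev
    return np.abs(y)

# Toy input: a single impulse at sample 100 (assumed test signal).
fs = 8000
x = np.zeros(400)
x[100] = 1.0
freqs = np.arange(500, 3501, 500)  # assumed analysis frequencies in Hz
envelopes = np.array([sff_envelope(x, fs, f) for f in freqs])
```

For this toy input, every row of `envelopes` is zero before sample 100, jumps at the impulse, and then decays, which illustrates how an impulse-like event shows up simultaneously at all analysis frequencies with sample-level time resolution.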

Publisher
Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 86, February 2017, Pages 52-63