Discrimination of speech from nonspeech in broadcast news based on modulation frequency features

Article code	Journal code	Publication year	English article	Full-text version
567537	876100	2011	10-page PDF	Free download

Keywords

Mutual information; Speech discrimination; Higher order singular value decomposition; Modulation spectrum

Related topics

Engineering and Basic Sciences; Computer Engineering; Signal Processing

Abstract

In audio content analysis, the discrimination of speech and non-speech is the first processing step before speaker segmentation and recognition, or speech transcription. Speech/non-speech segmentation algorithms usually consist of a frame-based scoring phase using MFCC features, combined with a smoothing phase. In this paper, a content-based speech discrimination algorithm is designed to exploit long-term information inherent in the modulation spectrum. In order to address the varying degrees of redundancy and discriminative power of the acoustic and modulation frequency subspaces, we first employ a generalization of SVD to tensors (Higher Order SVD) to reduce dimensions. Projection of modulation spectral features on the principal axes with the highest energy in each subspace results in a compact set of features with minimum redundancy. We further estimate the relevance of these projections to speech discrimination based on their mutual information with the target class. This system is built upon a segment-based SVM classifier in order to recognize the presence of voice activity in the audio signal. Detection experiments using Greek and US English broadcast news data composed of many speakers in various acoustic conditions suggest that the system provides complementary information to state-of-the-art mel-cepstral features.
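
The dimension-reduction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the training data are stacked as a 3-way array of shape (segments, acoustic frequency, modulation frequency), and the helper names `unfold`, `hosvd_bases` and `project`, as well as the `energy` threshold, are hypothetical choices made for this example only.

```python
import numpy as np

def unfold(tensor, mode):
    """Matricize a tensor: bring `mode` to the front, flatten the other axes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def hosvd_bases(tensor, modes, energy=0.95):
    """For each mode, keep the leading left singular vectors of its unfolding
    that together retain at least `energy` of the squared singular values."""
    bases = {}
    for mode in modes:
        u, s, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        cumulative = np.cumsum(s ** 2) / np.sum(s ** 2)
        k = int(np.searchsorted(cumulative, energy) + 1)
        bases[mode] = u[:, :k]
    return bases

def project(sample, u_acoustic, u_modulation):
    """Project one (acoustic x modulation) matrix onto the truncated bases."""
    return u_acoustic.T @ sample @ u_modulation

# Synthetic stand-in for modulation spectra (assumption: no real audio here).
# The data are built to have rank 3 along the acoustic axis and rank 2 along
# the modulation axis, so the truncation has something to find.
rng = np.random.default_rng(0)
core = rng.standard_normal((40, 3, 2))
a = rng.standard_normal((16, 3))
b = rng.standard_normal((8, 2))
spectra = np.einsum("nij,ai,bj->nab", core, a, b)   # shape (40, 16, 8)

bases = hosvd_bases(spectra, modes=(1, 2), energy=1 - 1e-9)
compact = project(spectra[0], bases[1], bases[2])
print(bases[1].shape, bases[2].shape, compact.shape)
```

Each segment is reduced from a 16 x 8 matrix to a much smaller one. On real modulation spectra the retained rank per subspace would depend on the chosen energy threshold.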

Research highlights
► Speech discrimination exploits long-term information inherent in modulation spectrum.
► Higher order singular value decomposition first reduces redundancy of features.
► Relevance of features to the task is estimated based on mutual information.
► The system can provide complementary information to the mel-cepstral features.
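
The relevance estimate mentioned in the highlights can be sketched as follows. The page does not say which mutual information estimator the authors use, so this example assumes a simple plug-in (histogram) estimate over equal-width bins. The names `quantize`, `mutual_information` and `rank_features` are hypothetical.

```python
import math
from collections import Counter

def quantize(values, n_bins=8):
    """Map real values to equal-width bin indices in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0   # guard against a constant feature
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def mutual_information(feature_bins, labels):
    """Plug-in estimate of I(X; Y) in bits from paired discrete observations."""
    n = len(labels)
    joint = Counter(zip(feature_bins, labels))
    px = Counter(feature_bins)
    py = Counter(labels)
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())

def rank_features(columns, labels, n_bins=8):
    """Return feature indices sorted by decreasing relevance to the labels."""
    scores = [mutual_information(quantize(col, n_bins), labels)
              for col in columns]
    return sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)

# Toy data (assumption): label 1 = speech, label 0 = non-speech.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
informative = [0.10, 0.20, 0.15, 0.05, 0.90, 0.80, 0.95, 0.85]
uninformative = [0.1, 0.9, 0.2, 0.8, 0.1, 0.9, 0.2, 0.8]
print(rank_features([uninformative, informative], labels, n_bins=2))
```

Projections ranked this way could then be fed, most relevant first, to a classifier such as the SVM named in the abstract.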

Publisher

Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 53, Issue 5, May–June 2011, Pages 726–735

Authors

Maria Markaki, Yannis Stylianou
