Cross-covariance-based features for speech classification in film audio

کد مقاله	کد نشریه	سال انتشار	مقاله انگلیسی	نسخه تمام متن
523439	868354	2015	7 صفحه PDF	دانلود رایگان

عنوان انگلیسی مقاله ISI

دانلود مقاله + سفارش ترجمه

دانلود مقاله ISI انگلیسی

رایگان برای ایرانیان

کلمات کلیدی

Voice activity detection - تشخیص فعالیت صوتی Speech detection - تشخیص گفتار Binary classification - طبقه بندی دوتایی

موضوعات مرتبط

مهندسی و علوم پایه مهندسی کامپیوتر نرم افزارهای علوم کامپیوتر

پیش نمایش صفحه اول مقاله

Cross-covariance-based features for speech classification in film audio

چکیده انگلیسی

As multimedia becomes the dominant form of entertainment through an ever increasing range of digital formats, there has been a growing interest in obtaining information from entertainment media. Speech is one of the core resources in multimedia, providing a foundation for the extraction of semantic information. Thus, detecting speech is a critical first step for speech-based information retrieval systems. This work focuses on speech detection in one of the dominant forms of entertainment media: feature films. A novel approach for voice activity detection (VAD) in film audio is proposed. The approach uses correlation to analyze associations of Mel Frequency Cepstral Coefficient (MFCC) pairs in speech and non-speech data. This information then drives feature selection for the creation of MFCC cross-covariance feature vectors (MFCC-CCs) which are used to train a random forest classifier to solve a binary speech/non-speech classification problem on audio data from entertainment media. The classifier performance is evaluated on a number of test sets and achieves a classification accuracy of up to 94%. The approach is also compared with state of the art and contemporary VAD algorithms, and demonstrates competitive results.

ناشر

Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Journal of Visual Languages & Computing - Volume 31, Part B, December 2015, Pages 215–221

نویسندگان

Matt Benatan, Kia Ng,

علوم انسانی و هنر

فنی، مهندسی و علوم پایه

پزشکی و سلامت

بیو تکنولوژی

پذیرش سفارش ترجمه

Cross-covariance-based features for speech classification in film audio

دسترسی سریع

ارتباط

English Website