Modeling nuisance variabilities with factor analysis for GMM-based audio pattern classification

Article code	Journal code	Year of publication	English article
557930	874817	2011	18-page PDF

Keywords

Speaker; Language; SVM (Support Vector Machine)

Related topics

Engineering and Basic Sciences; Computer Engineering; Signal Processing

Abstract

Audio pattern classification represents a particular statistical classification task and includes, for example, speaker recognition, language recognition, emotion recognition, speech recognition and, recently, video genre classification. The feature being used in all these tasks is generally based on a short-term cepstral representation. The cepstral vectors contain at the same time useful information and nuisance variability, which are difficult to separate in this domain. Recently, in the context of GMM-based recognizers, a novel approach using a Factor Analysis (FA) paradigm has been proposed for decomposing the target model into a useful information component and a session variability component. This approach is called Joint Factor Analysis (JFA), since it models jointly the nuisance variability and the useful information, using the FA statistical method. The JFA approach has even been combined with Support Vector Machines, known for their discriminative power. In this article, we successfully apply this paradigm to three automatic audio processing applications: speaker verification, language recognition and video genre classification. This is done by applying the same process and using the same free software toolkit. We will show that this approach allows for a relative error reduction of over 50% in all the aforementioned audio processing tasks.
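The decomposition of the target model mentioned in the abstract is often written in the JFA literature in the following supervector form. This is a sketch of that common formulation; the symbols are assumptions introduced here for illustration and do not appear on this page.

```latex
\[
\mathbf{m}_{(h,s)} \;=\; \mathbf{m} \;+\; \mathbf{D}\,\mathbf{y}_{s} \;+\; \mathbf{U}\,\mathbf{x}_{(h,s)}
\]
% Assumed notation:
%   m          : mean supervector of the universal background model (UBM)
%   D y_s      : useful-information component for target class s (D typically diagonal)
%   U x_(h,s)  : session (nuisance) variability component for session h of class s
%                (U typically low-rank)
```

Under this reading, the term with U captures the session variability to be compensated, while the term with D carries the information that identifies the speaker, language or genre.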

Research Highlights
▶ Joint Factor Analysis models jointly the useful information and the nuisance variability.
▶ Applied to 3 audio processing applications: speaker verification, language recognition and video genre classification.
▶ Analyzing JFA on GMM–UBM and SVM–UBM modeling.
▶ Same paradigm and same free software framework.
▶ JFA allows a relative error reduction of 50–70%.
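The relative error reduction quoted above is the fraction of the baseline error that the new system removes. A minimal sketch of that arithmetic follows; the function name and the error rates in the example are hypothetical and are not results from the article.

```python
def relative_error_reduction(baseline_error: float, new_error: float) -> float:
    """Return the fraction of the baseline error removed by the new system."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


# Hypothetical error rates in percent, chosen only to illustrate the formula.
baseline_eer = 10.0  # assumed baseline system error
jfa_eer = 4.0        # assumed error after session compensation

reduction = relative_error_reduction(baseline_eer, jfa_eer)
print(f"Relative error reduction: {reduction:.0%}")
```

A value of 0.5 or more corresponds to the "over 50%" claim: the compensated system makes at most half as many errors as the baseline.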

Publisher

Database: Elsevier - ScienceDirect
Journal: Computer Speech & Language - Volume 25, Issue 3, July 2011, Pages 481–498

Authors

Driss Matrouf, Florian Verdet, Mickaël Rouvier, Jean-François Bonastre, Georges Linarès
