Feature subset selection for improved native accent identification

Article code	Journal code	Publication year	English article	Full-text version
568785	876462	2010	16-page PDF	Free download

Keywords

Linear discriminant analysis (LDA); Feature selection; Language identification; Support vector machine; Gaussian mixture model

Related topics

Engineering and Basic Sciences; Computer Engineering; Signal Processing

Abstract

In this paper, we develop methods to identify accents of native speakers. Accent identification differs from other speaker classification tasks because accents may differ in a limited number of phonemes only and moreover the differences can be quite subtle. In this paper, it is shown that in such cases it is essential to select a small subset of discriminative features that can be reliably estimated and at the same time discard non-discriminative and noisy features. For identification purposes a speaker is modeled by a supervector containing the mean values for the features for all phonemes. Initial accent models are obtained as class means from the speaker supervectors. Then feature subset selection is performed by applying either ANOVA (analysis of variance), LDA (linear discriminant analysis), SVM-RFE (support vector machine-recursive feature elimination), or their hybrids, resulting in a reduced dimensionality of the speaker vector and more importantly a significantly enhanced recognition performance. We also compare the performance of GMM, LDA and SVM as classifiers on a full or a reduced feature subset. The methods are tested on a Flemish read speech database with speakers classified in five regions. The difficulty of the task is confirmed by a human listening experiment. We show that a relative improvement of more than 20% in accent recognition rate can be achieved with feature subset selection irrespective of the choice of classifier. We finally show that the construction of speaker-based supervectors significantly enhances results over a reference GMM system that uses the raw feature vectors directly as input, both in text dependent and independent conditions.
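
The pipeline described in the abstract (speaker supervectors, accent models as class means, then feature subset selection) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes ANOVA F-scores as the selection criterion, which is one of the methods the abstract names, and it substitutes a plain nearest-class-mean classifier for the GMM, LDA and SVM classifiers the paper compares. The function names and the toy three-dimensional "supervectors" are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def anova_f_scores(vectors, labels):
    """One-way ANOVA F statistic for each supervector dimension."""
    groups = defaultdict(list)
    for vec, label in zip(vectors, labels):
        groups[label].append(vec)
    n_total, n_classes, n_dims = len(vectors), len(groups), len(vectors[0])
    scores = []
    for d in range(n_dims):
        grand_mean = sum(vec[d] for vec in vectors) / n_total
        between = 0.0  # between-class sum of squares
        within = 0.0   # within-class sum of squares
        for members in groups.values():
            class_mean = sum(vec[d] for vec in members) / len(members)
            between += len(members) * (class_mean - grand_mean) ** 2
            within += sum((vec[d] - class_mean) ** 2 for vec in members)
        ms_between = between / (n_classes - 1)
        ms_within = within / (n_total - n_classes)
        scores.append(ms_between / ms_within if ms_within > 0 else float("inf"))
    return scores


def select_top_k(scores, k):
    """Indices of the k highest-scoring dimensions, in ascending order."""
    ranked = sorted(range(len(scores)), key=lambda d: scores[d], reverse=True)
    return sorted(ranked[:k])


def class_means(vectors, labels, dims):
    """Accent model per class: mean of the speaker vectors on the kept dims."""
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(vectors, labels):
        acc = sums.setdefault(label, [0.0] * len(dims))
        for i, d in enumerate(dims):
            acc[i] += vec[d]
        counts[label] += 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def classify(vec, means, dims):
    """Assumed stand-in classifier: nearest class mean (squared Euclidean)."""
    def sq_dist(mean):
        return sum((vec[d] - m) ** 2 for d, m in zip(dims, mean))
    return min(means, key=lambda label: sq_dist(means[label]))


# Synthetic toy data: dimension 0 separates the two classes,
# dimensions 1 and 2 carry little or no class information.
train_x = [[1.0, 5.0, 0.2], [1.2, 3.0, 0.9], [0.9, 4.0, 0.1],
           [3.0, 4.5, 0.8], [3.1, 3.5, 0.3], [2.9, 4.0, 0.5]]
train_y = ["A", "A", "A", "B", "B", "B"]

scores = anova_f_scores(train_x, train_y)
kept = select_top_k(scores, 1)
models = class_means(train_x, train_y, kept)
predicted = classify([2.8, 1.0, 0.0], models, kept)
```

On this toy data only dimension 0 survives selection, so the uninformative dimensions no longer contribute to the distance computation. Real supervectors would have one block of feature means per phoneme and far more dimensions than speakers, which is the setting in which the abstract argues that selection matters.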

Publisher

Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 52, Issue 2, February 2010, Pages 83–98

Authors

Tingyao Wu, Jacques Duchateau, Jean-Pierre Martens, Dirk Van Compernolle
