Article code: 565998
Journal code: 1452024
Publication year: 2016
English article: 14-page PDF
Full-text version: free download
English title of the ISI article
Phone classification via manifold learning based dimensionality reduction algorithms
Keywords
Phone classification; TIMIT; Manifold learning; Graph embedding framework; LDA-based dimensionality reduction
Related topics
Engineering and basic sciences; Computer engineering; Signal processing
Abstract


• Phone classification can be improved by using speech continuity constraints.
• The classical LDA-based discrimination can be enriched by using neighborhood constraints.
• Knowledge about speech production helps to understand the neighborhood constraints.

Mechanical limitations imposed on the articulators during speech production limit the intrinsic dimensionality of speech signals. This limitation leads to a specific neighborhood structure of speech sounds when they are represented in a high-dimensional feature space. We investigate whether phone classification can be improved by exploiting this neighborhood structure, by means of extended variants of the conventional Linear Discriminant Analysis (LDA) based on manifold learning.

In this extended LDA approach, the within-class and between-class scatter matrices are defined in terms of adjacency graphs. We compare extensions of LDA that use either a full adjacency graph or an adjacency graph defined in the neighborhood of the training observations. In addition, we apply different kernels for weighting the distances in the graphs, among them the Adaptive Kernel, which is proposed in this paper.

Experiments with TIMIT show that while LDA algorithms that use the full adjacency graph do not outperform traditional LDA, the algorithms that exploit only local information provide significantly better results than traditional LDA. These improvements are not uniform across different broad phonetic classes, which suggests that the added value of the neighborhood structure is phone-class dependent. The structure is represented by locally different densities in the neighborhood of feature vectors that are representative of a specific phone in a specific context.
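
To make the idea of scatter matrices defined through adjacency graphs more concrete, here is a minimal Python sketch of a local, graph-based LDA variant. It is an illustration under stated assumptions, not the paper's implementation: the function names `local_graph_lda` and `_knn_heat_weights` are hypothetical, the graphs are built from the k nearest same-class and different-class neighbours (one common choice in the graph embedding literature), and a plain heat kernel stands in for the Adaptive Kernel, whose definition is not given in the abstract.

```python
import numpy as np


def _knn_heat_weights(D, k, sigma):
    """Symmetric k-NN adjacency with heat-kernel weights (hypothetical helper).

    D holds squared distances; entries set to np.inf mark pairs that must
    not be connected and end up with weight exp(-inf) = 0.
    """
    n = D.shape[0]
    idx = np.argsort(D, axis=1)[:, :k]            # k smallest distances per row
    rows = np.repeat(np.arange(n), idx.shape[1])
    cols = idx.ravel()
    A = np.zeros((n, n))
    A[rows, cols] = np.exp(-D[rows, cols] / (2.0 * sigma ** 2))
    return np.maximum(A, A.T)                     # make the graph undirected


def local_graph_lda(X, y, n_components=1, k=5, sigma=1.0, reg=1e-6):
    """Sketch of LDA with graph-defined scatter matrices (assumed design).

    Returns (V, vals): projection directions as columns of V and the
    corresponding generalized eigenvalues in descending order.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    d = X.shape[1]

    # Pairwise squared Euclidean distances.
    sq = np.sum(X ** 2, axis=1)
    D = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    same = y[:, None] == y[None, :]

    # Within-class graph: nearest neighbours with the same label, no self-loops.
    D_within = np.where(same, D, np.inf)
    np.fill_diagonal(D_within, np.inf)
    # Between-class graph: nearest neighbours with a different label.
    D_between = np.where(same, np.inf, D)

    W_w = _knn_heat_weights(D_within, k, sigma)
    W_b = _knn_heat_weights(D_between, k, sigma)

    def laplacian(W):
        return np.diag(W.sum(axis=1)) - W

    # X^T L X equals 0.5 * sum_ij W_ij (x_i - x_j)(x_i - x_j)^T.
    S_w = X.T @ laplacian(W_w) @ X + reg * np.eye(d)  # regularized for stability
    S_b = X.T @ laplacian(W_b) @ X

    # Solve S_b v = lambda S_w v by whitening with the Cholesky factor of S_w.
    L = np.linalg.cholesky(S_w)
    L_inv = np.linalg.inv(L)
    M = L_inv @ S_b @ L_inv.T
    vals, vecs = np.linalg.eigh((M + M.T) / 2.0)
    order = np.argsort(vals)[::-1][:n_components]
    return L_inv.T @ vecs[:, order], vals[order]
```

Calling `V, vals = local_graph_lda(X, y, n_components=2)` on a feature matrix `X` (one row per frame) with labels `y` gives a projection `X @ V`. Setting `k` to the number of training points minus one connects every pair and so approximates the full-graph variant, while a small `k` keeps only local information.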

Publisher
Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 76, February 2016, Pages 28–41