Locally learning heterogeneous manifolds for phonetic classification

Article ID	Journal	Published Year	Pages	File Type
558200	Computer Speech & Language	2016	18 Pages	PDF

Abstract

Most state-of-the-art phone classifiers use the same features and decision criteria for all phones, even though different broad classes are characterized by different manners and places of articulation, which result in different acoustic features. This paper uses manifold learning to address structure in the acoustic space. Previous approaches to dimensionality reduction based on manifold learning assumed that the acoustic space can be characterized by a uniform manifold structure. In this paper we relax this assumption by learning different manifold structures for broad phonetic classes. Because all known classifiers confuse broad classes with one another, we designed a two-level classifier in which the top level consists of a number of partially overlapping broad classes. Since the resulting classifiers are not statistically independent, we propose a new method for fusing them. Experimental results show that our two-level classifier obtained slightly better results when broad-class-specific manifolds were learned than when a uniform manifold was used. However, the accuracy is still considerably lower than what could be obtained with oracle knowledge about broad class membership. From this we infer that phones do not form compact clusters in acoustic space.
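
To make the two-level structure concrete, the sketch below combines a top-level posterior over partially overlapping broad classes with class-specific phone posteriors by simple marginalization. This is an illustrative baseline only, not the fusion method proposed in the paper, which the abstract does not describe. The function name `fuse_two_level`, the class and phone labels, and all probabilities are hypothetical.

```python
from collections import defaultdict


def fuse_two_level(broad_posteriors, phone_posteriors):
    """Combine top-level and class-specific posteriors by marginalization.

    broad_posteriors: dict mapping broad class -> P(class | x)
    phone_posteriors: dict mapping broad class -> {phone: P(phone | x, class)}
    Returns a dict mapping phone -> normalized fused score.

    Assumption: a plain weighted sum, used here as a stand-in baseline.
    It ignores the statistical dependence between the classifiers.
    """
    scores = defaultdict(float)
    for broad, p_broad in broad_posteriors.items():
        for phone, p_phone in phone_posteriors[broad].items():
            # A phone that belongs to several overlapping broad classes
            # accumulates evidence from each of them.
            scores[phone] += p_broad * p_phone
    total = sum(scores.values())
    if total == 0.0:
        raise ValueError("all fused scores are zero")
    return {phone: score / total for phone, score in scores.items()}


# Hypothetical, hand-made numbers: "m" sits in two overlapping broad classes.
broad = {"nasal": 0.6, "sonorant": 0.4}
phones = {
    "nasal": {"m": 0.7, "n": 0.3},
    "sonorant": {"m": 0.5, "l": 0.5},
}
fused = fuse_two_level(broad, phones)
best = max(fused, key=fused.get)
```

In this toy setup the phone shared by both broad classes collects score from each, which is the practical effect of letting the top-level classes overlap: a top-level error does not necessarily exclude the correct phone.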

Keywords

Partial classification; Classifier fusion; TIMIT; Dimensionality reduction; Manifold learning