Article ID: 558200
Journal: Computer Speech & Language
Published Year: 2016
Pages: 18
File Type: PDF
Abstract

Most state-of-the-art phone classifiers use the same features and decision criteria for all phones, even though broad phonetic classes differ in manner and place of articulation and therefore in their acoustic features. This paper uses manifold learning to model structure in the acoustic space. Previous approaches to dimensionality reduction based on manifold learning assumed that the acoustic space can be characterized by a single, uniform manifold structure. In this paper we relax this assumption by learning different manifold structures for broad phonetic classes. Because all known classifiers confuse broad classes with one another, we designed a two-level classifier in which the top level consists of a number of partially overlapping broad classes. Since the resulting classifiers are not statistically independent, we propose a new method for fusing them. Experimental results show that our two-level classifier obtained slightly better results when broad-class-specific manifolds were learned than when a uniform manifold was used. However, the accuracy is still considerably lower than what could be obtained with oracle knowledge of broad-class membership. From this we infer that phones do not form compact clusters in acoustic space.
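
The two-level decision structure described above can be illustrated with a minimal sketch. It is not the fusion method proposed in the paper, which the abstract does not specify. It assumes a simple posterior-weighted sum in which each broad-class classifier's phone posteriors are weighted by the top-level posterior of that class. The broad classes, phone labels, function names (`fuse`, `classify`) and all numbers are hypothetical and chosen only to show how a phone shared by two overlapping classes accumulates evidence from both.

```python
from typing import Dict, Set

# Hypothetical, partially overlapping broad classes (illustration only).
# The phone "l" belongs to two classes, mimicking the overlap in the abstract.
BROAD_CLASSES: Dict[str, Set[str]] = {
    "vowel_like": {"aa", "iy", "l"},
    "sonorant_cons": {"l", "m", "n"},
    "obstruent": {"s", "t"},
}


def fuse(broad_post: Dict[str, float],
         phone_post: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Posterior-weighted fusion (an assumed baseline, not the paper's method).

    score(phone) = sum over classes c of P(c | x) * P(phone | c, x),
    followed by renormalization so the scores sum to one.
    """
    scores: Dict[str, float] = {}
    for cls, p_cls in broad_post.items():
        for phone, p_phone in phone_post[cls].items():
            if phone not in BROAD_CLASSES[cls]:
                raise ValueError(f"{phone!r} is not a member of {cls!r}")
            scores[phone] = scores.get(phone, 0.0) + p_cls * p_phone
    total = sum(scores.values())
    if total <= 0.0:
        raise ValueError("all fused scores are zero")
    return {phone: s / total for phone, s in scores.items()}


def classify(broad_post: Dict[str, float],
             phone_post: Dict[str, Dict[str, float]]) -> str:
    """Return the phone with the highest fused score."""
    fused = fuse(broad_post, phone_post)
    return max(fused, key=fused.get)


# Made-up posteriors for one acoustic frame.
example_broad = {"vowel_like": 0.5, "sonorant_cons": 0.4, "obstruent": 0.1}
example_phone = {
    "vowel_like": {"aa": 0.2, "iy": 0.2, "l": 0.6},
    "sonorant_cons": {"l": 0.5, "m": 0.3, "n": 0.2},
    "obstruent": {"s": 0.5, "t": 0.5},
}
print(classify(example_broad, example_phone))  # prints: l
```

In this toy frame the shared phone "l" wins because both classes that contain it contribute to its score. A weighted sum of this kind treats the second-level classifiers as if their errors were unrelated, which is exactly the assumption the abstract says does not hold for overlapping classes.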

Related Topics
Physical Sciences and Engineering > Computer Science > Signal Processing