Article ID: 485440
Journal: Procedia Computer Science
Published Year: 2016
Pages: 8 Pages
File Type: PDF
Abstract

We investigate the application of two novel lattice-constrained Viterbi training strategies to the task of improving sub-word unit (SWU) inventories that were discovered using an unsupervised sparse coding approach. The automatic determination of these SWUs remains a critical and unresolved obstacle to the development of ASR for under-resourced languages. The first lattice-constrained training strategy attempts to jointly learn a bigram SWU language model along with the evolving SWU inventory. We find that this substantially increases correspondence with expert-defined reference phonemes on the TIMIT dataset, but does little to improve pronunciation consistency. The second approach attempts to jointly infer an SWU pronunciation model for each word in the training vocabulary, and to constrain transcription using these models. We find that this lightly supervised approach again substantially increases correspondence with the reference phonemes, and in this case also improves pronunciation consistency.
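To make the first strategy concrete, the following is a minimal, illustrative sketch of Viterbi decoding over SWU labels in which frame-level emission scores are combined with bigram language-model transition scores. All names and data structures here are assumptions for illustration; the paper's actual lattice-constrained training is not described in enough detail in the abstract to reproduce.

```python
def viterbi_bigram(emissions, bigram_lp, units):
    """Decode the best sub-word unit (SWU) sequence.

    emissions: list of per-frame dicts {unit: emission log-prob}
    bigram_lp: dict (prev_unit, unit) -> bigram LM log-prob
    units:     list of SWU labels
    (Hypothetical interface; simplified to one unit per frame.)
    """
    # Initialise with the first frame's emission scores.
    delta = {u: emissions[0][u] for u in units}
    backptrs = []
    for frame in emissions[1:]:
        new_delta, ptr = {}, {}
        for u in units:
            # Best predecessor under accumulated score + bigram LM score.
            prev, score = max(
                ((p, delta[p] + bigram_lp.get((p, u), -1e9)) for p in units),
                key=lambda x: x[1],
            )
            new_delta[u] = score + frame[u]
            ptr[u] = prev
        delta = new_delta
        backptrs.append(ptr)
    # Backtrace from the best-scoring final unit.
    best = max(delta, key=delta.get)
    path = [best]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    return list(reversed(path))
```

In joint training, the decoded sequences would be used both to re-estimate the SWU acoustic models and to update the bigram LM counts on each iteration; the sketch above shows only the constrained decoding step.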

Related Topics
Physical Sciences and Engineering Computer Science Computer Science (General)