Ensemble environment modeling using affine transform group

Article ID	Journal	Published Year	Pages	File Type
566772	Speech Communication	2015	14 Pages	PDF

Abstract

• ESSEM allowed unsupervised model adaptation with only one testing utterance.
• ATG mapping function enhances ESSEM framework with sufficient adaptation statistics.
• Adaptation performance determined by appropriate selection of ensemble models.
• Over-fitting issues handled with optimization processes: MAP, MS, and CS.
• Imposing constraints on ATG allows flexible extension to LR, BC, LCB, LC, and BF.

The ensemble speaker and speaking environment modeling (ESSEM) framework was designed to provide online optimization for enhancing workable systems under real-world conditions. In the offline phase of ESSEM, ensemble models are built to characterize specific environments, each based on local statistics prepared from that particular condition. In the online phase, a mapping function is computed from the incoming testing data to perform model adaptation. Previous studies used linear combination (LC) and linear combination with a correction bias (LCB) as simple mapping functions that apply only one weighting coefficient to each model. To make better use of the ensemble models, this study presents a generalized affine transform group (ATG) mapping function for the ESSEM framework. Although ATG characterizes unknown testing conditions more precisely by using a larger number of parameters, over-fitting occurs when the available adaptation data is especially limited. This study handles over-fitting with three optimization processes: the maximum a posteriori (MAP) criterion, model selection (MS), and cohort selection (CS). Experimental results showed that ATG, together with the three optimization processes, enabled the ESSEM framework to perform unsupervised model adaptation using only one utterance while providing consistent performance improvements.
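
To make the contrast between the mapping functions concrete, the following is a minimal sketch, not the authors' implementation. It assumes each ensemble model is summarized by a single mean vector, that LC/LCB use one scalar weight per model (as the abstract states), and that ATG applies one matrix per model plus a shared bias; the exact parameterization in the paper may differ. All function names (`lc_map`, `atg_map`, `matvec`) are hypothetical.

```python
from typing import List, Optional

Vector = List[float]
Matrix = List[List[float]]


def matvec(a: Matrix, x: Vector) -> Vector:
    """Multiply a matrix by a vector."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def lc_map(means: List[Vector], weights: List[float],
           bias: Optional[Vector] = None) -> Vector:
    """LC (bias=None) or LCB: one scalar weight per ensemble model."""
    dim = len(means[0])
    out = list(bias) if bias is not None else [0.0] * dim
    for w, mu in zip(weights, means):
        for d in range(dim):
            out[d] += w * mu[d]
    return out


def atg_map(means: List[Vector], transforms: List[Matrix],
            bias: Vector) -> Vector:
    """ATG (assumed form): one matrix per ensemble model plus a shared bias."""
    out = list(bias)
    for a, mu in zip(transforms, means):
        for d, v in enumerate(matvec(a, mu)):
            out[d] += v
    return out


# Two toy ensemble means in a 2-dimensional feature space (made-up values).
means = [[1.0, 2.0], [3.0, 4.0]]
weights = [0.25, 0.75]

lc_result = lc_map(means, weights)

# Constraining each ATG matrix to weight * identity (with zero bias)
# recovers LC, illustrating how constraints reduce ATG to simpler mappings.
scaled_identities = [[[w, 0.0], [0.0, w]] for w in weights]
atg_result = atg_map(means, scaled_identities, [0.0, 0.0])

# Free parameters under these assumptions: LC has P, ATG has P*D*D + D.
P, D = len(means), len(means[0])
lc_params = P
atg_params = P * D * D + D
```

Under these assumptions the parameter count of ATG grows with the square of the feature dimension, which is why a single adaptation utterance can over-fit it and why the abstract pairs ATG with MAP, MS, and CS.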

Keywords

Model selection; Maximum a posteriori; Prior knowledge; Environment modeling; Ensemble modeling