Article ID Journal Published Year Pages File Type
566007 Speech Communication 2016 16 Pages PDF
Abstract

• Study on the correlation between ASR accuracy and reverberant acoustic conditions.
• Experiments involve a large variety of simulated and measured room impulse responses.
• The best duration of early arrivals is determined experimentally.
• The approach is applied to data contamination for acoustic model training and model selection.
• A large vocabulary recognition task (WSJ) is considered using both GMM and DNN.

This work presents an experimental analysis of distant-talking speech recognition in a variety of reverberant conditions, correlating ASR performance with a compact representation of the propagation channel (i.e., the room impulse response). It is well known that reverberation and background noise degrade speech recognition performance, but few studies have comprehensively investigated the relation between room impulse responses and recognition rates. In particular, we show how ASR accuracy is related to features derived from the structure of the early arrivals and the reverberation tail. A representation based on a combination of a few parameters is therefore proposed, and the impact of reverberation on different speech recognition tasks is analysed. The derived measure can be applied to data contamination for acoustic modeling, where it can be used either to select the most suitable model for a given acoustic condition or to define the subset of room impulse responses used to create partially matched reverberant models. Recognition results obtained with different back-end solutions (GMM, DNN), on data generated with the image method and with real impulse responses, validate the effectiveness of the approach.
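As a rough illustration of the kind of quantity discussed above, the sketch below computes an early-to-late energy ratio from a room impulse response and contaminates a clean signal by convolution. It is not the measure proposed in the paper, whose exact definition is not given in this abstract; it is a generic clarity-index-style ratio. The function names, the toy impulse response, and the 50 ms boundary are assumptions made for illustration (the paper determines the duration of the early arrivals experimentally).

```python
import numpy as np

def early_to_late_ratio_db(rir, fs, early_ms=50.0):
    """Energy ratio in dB between the early arrivals and the reverberation tail.

    The boundary early_ms is an illustrative placeholder, not a value
    taken from the paper.
    """
    rir = np.asarray(rir, dtype=float)
    # Assumption: the strongest peak marks the direct path.
    direct = int(np.argmax(np.abs(rir)))
    split = direct + int(round(early_ms * 1e-3 * fs))
    early = np.sum(rir[direct:split] ** 2)
    late = np.sum(rir[split:] ** 2)
    return 10.0 * np.log10(early / late)

def contaminate(clean, rir):
    """Simulate distant-talking speech by convolving clean speech with an RIR."""
    return np.convolve(clean, rir)[: len(clean)]

def synthetic_rir(fs, t60, length_s=1.0, seed=0):
    """Toy RIR (hypothetical): unit direct path plus exponentially decaying noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(length_s * fs)) / fs
    # Amplitude falls by a factor of 1000 (60 dB in energy) at t = t60.
    decay = 10.0 ** (-3.0 * t / t60)
    rir = 0.1 * rng.standard_normal(t.size) * decay
    rir[0] = 1.0
    return rir

if __name__ == "__main__":
    fs = 16000
    for t60 in (0.3, 0.6, 0.9):
        rir = synthetic_rir(fs, t60)
        print(f"T60 = {t60:.1f} s  early/late = {early_to_late_ratio_db(rir, fs):.1f} dB")
```

In a model-selection setting, such a ratio could be computed for each available impulse response and used to pick the acoustic model trained on the closest condition; how the paper combines its parameters is not specified here.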
