Speech bottleneck feature extraction method based on overlapping group lasso sparse deep neural network

Article ID	Journal	Published Year	Pages	File Type
6960530	Speech Communication	2018	14 Pages	PDF

Abstract

In this paper, a novel speech bottleneck feature extraction method based on an overlapping group lasso sparse deep neural network is proposed. By changing the architecture of the deep neural network (DNN), the method extracts speech bottleneck features that contain supervised class information and information about adjacent speech frames, in order to improve speech recognition performance. First, to construct the sparse deep neural network, the sparse regularization of the overlapping group lasso algorithm is added as a penalty term to the objective function of a single DNN. Second, the speech bottleneck features are extracted from the bottleneck layer of the sparse bottleneck DNN (BN-DNN). Finally, the speech bottleneck features are used as the input features of the deep neural network-hidden Markov model (DNN-HMM) speech recognition system. Large vocabulary continuous speech recognition experiments on the Switchboard database indicate that the proposed algorithm can extract speech bottleneck features that carry prior information and can reduce the word error rate in continuous speech recognition.
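
The penalty term described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the groups are defined, so the code below assumes, purely for illustration, that groups are sliding windows of consecutive rows of a weight matrix that overlap with their neighbours. The names `overlapping_group_lasso`, `regularized_loss`, `group_size`, `stride`, and `lam` are hypothetical and are not taken from the paper.

```python
import numpy as np


def overlapping_group_lasso(W, group_size=3, stride=1):
    """Sum of L2 norms over overlapping groups of rows of W.

    Assumption: each group is a window of `group_size` consecutive rows,
    and windows advance by `stride`, so neighbouring groups share rows
    whenever stride < group_size.
    """
    n_rows = W.shape[0]
    penalty = 0.0
    for start in range(0, n_rows - group_size + 1, stride):
        group = W[start:start + group_size, :]
        # L2 norm of all weights in the group (not squared), which is
        # what drives whole groups toward zero.
        penalty += np.sqrt(np.sum(group ** 2))
    return penalty


def regularized_loss(data_loss, weights, lam=1e-3):
    """Data loss plus lam times the penalty summed over all weight matrices."""
    return data_loss + lam * sum(overlapping_group_lasso(W) for W in weights)
```

For example, `regularized_loss(ce, [W1, W2], lam=1e-3)` would add the penalty for two layers to a cross-entropy value `ce`. Because the norm of each group is not squared, the penalty tends to zero out entire groups of weights, which is one way a network can be made sparse.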

Keywords

Automatic speech recognition; Deep neural network