Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
411950 | 679598 | 2015 | 10-page PDF | Free download |
This paper builds on the technique of extracting features from the spectrogram image of sound signals for automatic sound recognition. The spectrogram image is divided into blocks, and statistical distributions are extracted from each block as features. Compared with related work, however, we reduce the dimensionality of the feature vector by using mean and standard deviation values along the rows and columns of the blocks, without compromising classification accuracy. We demonstrate the technique in an audio surveillance application and evaluate its performance using four common multiclass support vector machine (SVM) classification techniques: one-against-all, one-against-one, decision directed acyclic graph, and adaptive directed acyclic graph. Experiments were carried out on an audio database with 10 sound classes, each containing multiple subclasses with intraclass diversity and interclass similarity in terms of signal properties. Under noisy conditions, the proposed reduced spectrogram image feature (RSIF) produced significantly better classification accuracy than conventional log-compressed mel-frequency cepstral coefficients (MFCCs) and marginally better classification accuracy than linear MFCCs, which do not use any compression. The linear spectrogram image representation for feature extraction and the one-against-all multiclass SVM classification method were found to be the most noise robust. In addition, significantly improved results were obtained under noisy conditions when the RSIF was combined with linear MFCCs.
Journal: Neurocomputing - Volume 158, 22 June 2015, Pages 90–99
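
The dimensionality reduction described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact feature layout, so the code below assumes one plausible reading: the image is split into a grid of blocks, and a mean and a standard deviation are computed over each row of blocks and each column of blocks. The function name `reduced_block_features` and its parameters are hypothetical and are not taken from the paper.

```python
from statistics import fmean, pstdev


def reduced_block_features(image, n_row_blocks, n_col_blocks):
    """Mean and standard deviation over each row and each column of blocks.

    ASSUMPTION: this layout is one reading of the abstract, not the
    authors' confirmed method. `image` is a list of equal-length rows of
    non-negative intensities (for example, a spectrogram scaled to [0, 1]).
    """
    n_rows, n_cols = len(image), len(image[0])
    block_h = n_rows // n_row_blocks  # block height in pixels
    block_w = n_cols // n_col_blocks  # block width in pixels
    used_rows = block_h * n_row_blocks  # ignore leftover edge pixels
    used_cols = block_w * n_col_blocks

    features = []
    # One (mean, std) pair per row of blocks, i.e. per horizontal strip.
    for i in range(n_row_blocks):
        strip = [v
                 for row in image[i * block_h:(i + 1) * block_h]
                 for v in row[:used_cols]]
        features += [fmean(strip), pstdev(strip)]
    # One (mean, std) pair per column of blocks, i.e. per vertical strip.
    for j in range(n_col_blocks):
        strip = [v
                 for row in image[:used_rows]
                 for v in row[j * block_w:(j + 1) * block_w]]
        features += [fmean(strip), pstdev(strip)]
    return features


# Toy 4x4 "image" split into a 2x2 grid of blocks.
toy_image = [
    [0, 0, 2, 2],
    [0, 0, 2, 2],
    [4, 4, 6, 6],
    [4, 4, 6, 6],
]
toy_features = reduced_block_features(toy_image, 2, 2)
```

Under this assumed layout, an R-by-C grid of blocks yields 2(R + C) values, compared with 2RC values if a mean and a standard deviation were kept for every block, which is where the reduction in feature-vector length would come from.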