Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
454075 | 695093 | 2012 | 17-page PDF | Free download |

This paper introduces a nonlinear function into the frequency spectrum that improves the detection of vowels, diphthongs, and semivowels within the speech signal. The resulting lower efficiency of consonant detection is compensated for by implementing hangover and hangbefore criteria, and the paper presents a procedure for determining the optimal constants used by these criteria more quickly. The nonlinearly modified frequency spectrum is used in the proposed GMM (Gaussian Mixture Model) based VAD (Voice Activity Detection) algorithm. The proposed VAD algorithm was compared with seven other VAD algorithms on the Aurora 2 database. The experiments evaluated frame-level detection errors and speech recognition performance for two acoustic training modes (multi-condition and clean only). The proposed VAD algorithm obtained the lowest average percentage of frame errors and also improved speech recognition performance in both training modes.
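The frame error percentage mentioned above can be illustrated with a minimal sketch. It assumes the metric is simply the share of frames on which the VAD decision disagrees with a reference label; the paper's exact definition is not given here, and the function name and the sample labels are hypothetical.

```python
def frame_error_percentage(reference, decisions):
    """Return the percentage of frames where the VAD decision differs
    from the reference label (assumed definition, not the paper's)."""
    if len(reference) != len(decisions):
        raise ValueError("reference and decisions must have the same length")
    if not reference:
        raise ValueError("at least one frame is required")
    # Count frames on which the two label sequences disagree.
    errors = sum(1 for ref, dec in zip(reference, decisions)
                 if bool(ref) != bool(dec))
    return 100.0 * errors / len(reference)


# Hypothetical labels: 1 = speech frame, 0 = non-speech frame.
reference = [0, 0, 1, 1, 1, 1, 0, 0]
decisions = [0, 1, 1, 1, 0, 1, 0, 0]
print(frame_error_percentage(reference, decisions))  # 2 of 8 frames differ
```

Averaging this value over all test utterances and noise conditions would give one possible "average percentage of frame errors" for comparing VAD algorithms.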
Highlights
► We use a Gaussian mixture model based voice activity detection algorithm.
► We introduce a nonlinear function defined on minimum and maximum statistics.
► The nonlinear function improves the detection of vowels, diphthongs, and semivowels.
► The nonlinear function reduces the detection of consonants.
► Implementing hangover and hangbefore criteria compensates for the reduced detection of consonants.
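The idea behind hangover and hangbefore criteria can be sketched as follows. This is a generic illustration, not the paper's procedure: it assumes that every frame detected as speech also marks a fixed number of preceding frames (hangbefore) and following frames (hangover) as speech, so that weak consonants next to a detected vowel are retained. The function name and the constants 2 and 3 are hypothetical; the paper determines its own optimal constants.

```python
def apply_hang_criteria(decisions, hangbefore=2, hangover=3):
    """Extend each detected speech frame backward by `hangbefore` frames
    and forward by `hangover` frames (assumed, simplified scheme)."""
    n = len(decisions)
    smoothed = [False] * n
    for i, is_speech in enumerate(decisions):
        if not is_speech:
            continue
        # Clamp the extended window to the bounds of the utterance.
        start = max(0, i - hangbefore)
        end = min(n, i + hangover + 1)
        for j in range(start, end):
            smoothed[j] = True
    return smoothed


# Hypothetical raw VAD output: only the vowel frames (4 and 5) are detected.
raw = [0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0]
smoothed = apply_hang_criteria(raw, hangbefore=2, hangover=3)
print([int(x) for x in smoothed])  # [0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0]
```

Larger constants recover more consonant frames but also label more noise frames as speech, which is why the choice of constants matters.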
Journal: Computers & Electrical Engineering - Volume 38, Issue 6, November 2012, Pages 1820–1836