Article ID: 6961191
Journal: Speech Communication
Published Year: 2015
Pages: 11
File Type: PDF
Abstract
Whisper-island detection is a challenging research problem that has received limited attention in the speech research community. Effective detection of whisper islands, or vocal effort change points (VECPs), is the first step needed before subsequent speech processing steps can address whisper. In this study, we first propose an improved entropy-based feature, derived from a previous study, and integrate it into a model-less whisper-island detection algorithm. The improved 3-D WhID feature discriminates better between whisper and neutral speech, resulting in a miss detection rate (MDR) of 0.00%, a lower false alarm rate (FAR) and mismatch rate (MMR), and, collectively, a reduced multi-error score (MES). With improved VECP detection results and no need for a previously trained GMM, the BIC-based vocal effort clustering algorithm attains a 100% detection rate for whisper islands. This study also addresses the more challenging task of distant whisper-island detection, using a proposed model-based algorithm that performs frame-based vocal effort likelihood space modeling. To support this, a corpus named UT-VE-III is developed, consisting of spontaneous and read whisper-embedded neutral speech recorded with a microphone array at various distances in a real-world conference room. For the whisper-embedded neutral speech of UT-VE-III at distances of 1 m, 3 m and 5 m, recorded with a lavalier microphone and a distant microphone, the proposed algorithm sustains consistent VECP detection and whisper classification rates.
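The abstract does not spell out how the BIC is computed. As a rough illustration only, the sketch below shows the generic BIC change-point criterion for a single change inside a window of feature frames, modeling each segment with one full-covariance Gaussian. It is not the authors' algorithm and does not use their 3-D WhID feature; the function names, the `margin` and `lam` parameters, and the synthetic input are all assumptions made for this example.

```python
import numpy as np

def _logdet_cov(x, eps=1e-6):
    # Log-determinant of the maximum-likelihood covariance of the frames in x.
    # A small ridge (eps, an assumed value) keeps the matrix well conditioned.
    d = x.shape[1]
    cov = np.cov(x, rowvar=False, bias=True).reshape(d, d) + eps * np.eye(d)
    _, logdet = np.linalg.slogdet(cov)
    return logdet

def delta_bic(x, i, lam=1.0):
    # BIC gain of modeling x[:i] and x[i:] with two Gaussians instead of one.
    # A positive value favors a change point at frame i.
    n, d = x.shape
    penalty = 0.5 * lam * (d + 0.5 * d * (d + 1)) * np.log(n)
    gain = 0.5 * (n * _logdet_cov(x)
                  - i * _logdet_cov(x[:i])
                  - (n - i) * _logdet_cov(x[i:]))
    return gain - penalty

def find_change_point(x, margin=20, lam=1.0):
    # Scan candidate frames, keeping `margin` frames on each side so that
    # both covariance estimates are usable. Returns (index or None, score).
    n = x.shape[0]
    candidates = list(range(margin, n - margin))
    scores = [delta_bic(x, i, lam) for i in candidates]
    best = int(np.argmax(scores))
    if scores[best] > 0:
        return candidates[best], scores[best]
    return None, scores[best]
```

A typical call would be `find_change_point(features)`, where `features` is a hypothetical `(frames, dims)` array. The penalty weight `lam` trades missed change points against false alarms, so in practice it would need tuning on held-out data.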
Related Topics
Physical Sciences and Engineering > Computer Science > Signal Processing