Article ID: 1101815
Journal: Journal of Voice
Published Year: 2012
Pages: 9
File Type: PDF
Abstract

Summary

Objectives/Hypothesis: Automatic voice evaluation is usually performed on stable sections of sustained vowels, which often cannot capture hoarseness properly. The measures cepstral peak prominence (CPP) and smoothed CPP (CPPS) do not require exact determination of the cycles of fundamental frequency like established perturbation-based measures. They can also be applied to text recordings. In this study, they were compared with perceptual evaluation of voice quality and the German roughness-breathiness-hoarseness (RBH) scheme.

Study Design: Retrospective data analysis.

Methods: Seventy-three hoarse patients (48.3 ± 16.8 years) uttered the vowel /e/ and read the German version of the text “The North Wind and the Sun”. The text recordings were evaluated perceptually by five speech therapists and physicians according to the RBH scale. The criterion “overall quality” was measured on a 4-point scale and a visual analog scale. For the human-machine correlation, the automatic measures of the Praat program (vowels only) and the “cpps” software were compared with the experts' ratings. The experiments were repeated for speakers with jitter ≤5% or shimmer ≤5% (n = 47).

Results: For the entire group (n = 73), the best human-machine results for most of the rating criteria were obtained for text-based CPP and CPPS (up to |ρ| = 0.73). For the 47 selected speakers, the correlation was remarkably worse for all measures but still best for text-based CPP and CPPS (|ρ| ≤ 0.50).

Conclusions: Cepstrum analysis should be performed on a text recording. Then, it outperforms all perturbation-based measures, and it can be a meaningful objective support for perceptual analysis.
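The human-machine agreement is reported as |ρ|. The abstract does not say which coefficient ρ denotes; the sketch below assumes Spearman's rank correlation, a common choice when one variable is an ordinal rating. The function names and the sample values are hypothetical and are not taken from the study.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: one automatic measure and one mean expert rating per speaker.
cpps_values = [12.1, 9.8, 7.5, 5.2, 3.9]
hoarseness_ratings = [0, 1, 1, 2, 3]
rho = spearman_rho(cpps_values, hoarseness_ratings)
print(f"rho = {rho:.2f}, |rho| = {abs(rho):.2f}")
```

In this made-up sample the measure falls as the rating rises, so ρ is negative; reporting |ρ| compares the strength of agreement across measures regardless of sign.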

Related Topics
Health Sciences > Medicine and Dentistry > Otorhinolaryngology and Facial Plastic Surgery