Comparison of performance with voiced and whispered speech in word recognition and mean-formant-frequency discrimination

Article ID	Journal	Published Year	Pages	File Type
567435	Speech Communication	2012	16 Pages	PDF

Abstract

There has recently been a series of studies concerning the interaction of glottal pulse rate (GPR) and mean-formant-frequency (MFF) in the perception of speaker characteristics and in speech recognition. This paper extends that research by comparing the recognition and discrimination performance achieved with voiced words to that achieved with whispered words. The recognition experiment shows that performance with whispered words is slightly worse than with voiced words at all MFFs when the GPR of the voiced words is in the middle of the normal range. However, as GPR decreases below this range, voiced-word performance declines and eventually becomes worse than whispered-word performance. The discrimination experiment shows that the just noticeable difference (JND) for MFF is essentially independent of the mode of vocal excitation; the JND is close to 5% for both voiced and whispered words for all speaker types. The interaction between GPR and vocal tract length (VTL) is interpreted in terms of the stability of the internal representation of speech, which improves with GPR across the range of values used in these experiments.

► We compared recognition and discrimination for scaled versions of voiced and whispered words.
► Voiced words with normal pitch values are more recognizable than whispered words.
► Whispered words are more recognizable than voiced words with low pitch values.
► Threshold for formant-frequency discrimination is the same for voiced and whispered words (5%).