Article ID: 6951449
Journal: Computer Speech & Language
Published Year: 2018
Pages: 18
File Type: PDF
Abstract
In this paper we explore the automatic prediction of speaker appeal from recordings of political speech. The database contains recordings of a single speaker in a wide range of situations (interview, election rally, etc.) and has been annotated for six speaker traits: boring, charismatic, enthusiastic, inspiring, likeable, and persuasive. The aim of this study is to predict these ratings from acoustic features of the speech. We offer three key contributions. First, we explore the effect of acoustic environment on the perception of speaker ability. We find significant biases in the perception of all six traits, with interview speech consistently rated as less appealing and election rally speech as more appealing. Second, we attempt to exploit this bias by modelling speech from each situation separately, which gives a significant improvement in classification performance. Third, because the database covers seven years, we analyse the variance in both annotations and acoustic features over time to uncover temporal trends in speaker appeal. We find significant trends showing a decline in the speaker's prosodic activity over time, which mirrors a decline in the perception of speaker appeal as measured by the database annotations.
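As a rough illustration of the kind of temporal trend analysis the abstract describes, the sketch below fits an ordinary least-squares slope to a quantity measured over successive years. It is not the authors' method: the function name `least_squares_slope`, the variable names, and all numeric values are hypothetical and invented for illustration, and a real analysis would also need a significance test, which is omitted here.

```python
from typing import Sequence


def least_squares_slope(x: Sequence[float], y: Sequence[float]) -> float:
    """Return the ordinary least-squares slope of y regressed on x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of squared deviations of x; zero means all x values are identical.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    # Sum of cross-deviations between x and y.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical toy values, invented purely for illustration (NOT from the paper).
years = [0, 1, 2, 3, 4, 5, 6]                        # years since first recording
pitch_range = [6.0, 5.5, 5.0, 4.5, 4.0, 3.5, 3.0]    # stand-in for prosodic activity
appeal = [4.0, 3.75, 3.5, 3.25, 3.0, 2.75, 2.5]      # stand-in for a trait rating

pitch_slope = least_squares_slope(years, pitch_range)
appeal_slope = least_squares_slope(years, appeal)

# Two negative slopes would be consistent with both quantities declining over time.
both_declining = pitch_slope < 0 and appeal_slope < 0
print(pitch_slope, appeal_slope, both_declining)
```

With real data, `years` would come from the recording dates, and the two value lists from an extracted acoustic feature and the mean annotation score per recording or per year.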
Related Topics
Physical Sciences and Engineering > Computer Science > Signal Processing