Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
558483 | Computer Speech & Language | 2011 | 13 Pages |
We consider the problem of using speech processing to characterize an aggregate of voice data, in contrast to inferences about individual voice cuts. We derive simple turn-taking models from speaker activity detection output on the Switchboard-1 corpus. These models can be used to cluster speakers into turn-taking ‘styles.’ Demographic fields and turn-taking behavior prove to be statistically dependent, so observed speaker activity improves estimates of the demographics of held-out data. Finally, we use turn-taking style to estimate speaker influence.
► Turn-taking study results from speech activity detection on Switchboard-1.
► Speakers can be clustered via turn-taking model likelihoods.
► Turn-taking styles can be used to estimate data set demographics.
► Speaker influence related to speaker degrees.