Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
567018 | 1452042 | 2014 | 18-page PDF | Free download |
• We present a study and an evaluation of various spontaneous speech features.
• We propose a two-level strategy to assign a spontaneity level to each speech segment.
• A spontaneous speech detection system can be used for speaker role detection.
Processing spontaneous speech is one of the many challenges that automatic speech recognition systems have to deal with. The main characteristics of this kind of speech are disfluencies (filled pauses, repetitions, false starts, etc.), and many studies have focused on their detection and correction. Spontaneous speech is defined in opposition to prepared speech, where utterances contain well-formed sentences close to those found in written documents.

Acoustic and linguistic features made available by an automatic speech recognition system are proposed to characterize and detect spontaneous speech segments in large audio databases. To better define this notion of spontaneous speech, segments of an 11-hour corpus (French Broadcast News) were manually labeled according to three classes of spontaneity.

First, we present a study of these features. We then propose a two-level strategy to automatically assign a class of spontaneity to each speech segment. The proposed system reaches 73.0% precision and 73.5% recall on highly spontaneous speech segments, and 66.8% precision and 69.6% recall on prepared speech segments.

A quantitative study shows that the classes of spontaneity are useful information for characterizing speaker roles. This is confirmed by extending the speech spontaneity characterization approach to build an efficient automatic speaker role recognition system.
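
For readers unfamiliar with how per-class precision and recall figures like those in the abstract are obtained, the following is a minimal sketch, not the authors' evaluation code. The class names (`"prepared"`, `"low"`, `"high"`), the function `precision_recall`, and the toy label lists are all assumptions made for illustration; the abstract states only that three classes of spontaneity were used.

```python
from typing import Sequence, Tuple


def precision_recall(gold: Sequence[str], pred: Sequence[str], cls: str) -> Tuple[float, float]:
    """Return (precision, recall) for one class over aligned segment labels.

    gold and pred hold one label per speech segment, in the same order.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    # Segments labeled cls by both the annotator and the system.
    true_pos = sum(1 for g, p in zip(gold, pred) if g == cls and p == cls)
    predicted = sum(1 for p in pred if p == cls)  # system said cls
    actual = sum(1 for g in gold if g == cls)     # annotator said cls

    # Guard against empty denominators when a class never occurs.
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


# Hypothetical toy data: six segments, three assumed spontaneity classes.
gold = ["prepared", "low", "high", "high", "prepared", "high"]
pred = ["prepared", "high", "high", "low", "prepared", "high"]

for label in ("prepared", "low", "high"):
    p, r = precision_recall(gold, pred, label)
    print(f"{label}: precision={p:.3f} recall={r:.3f}")
```

Precision answers "of the segments the system labeled as this class, how many were right", while recall answers "of the segments that truly belong to this class, how many did the system find".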
Journal: Speech Communication - Volume 56, January 2014, Pages 1–18