Applying data mining techniques to corpus based prosodic modeling

Article ID	Journal	Published Year	Pages	File Type
567756	Speech Communication	2007	17 Pages	PDF

Abstract

This article presents MEMOInt, a methodology to automatically extract the intonation patterns which characterize a given corpus, with applications in text-to-speech systems. Easy to understand information about the form of the characteristic patterns found in the corpus can be obtained from MEMOint in a way which allows easy comparison with other proposals. A visual representation of the relationship between the set of prosodic features which could have been selected to label the corpus and the intonation contour patterns is also easy to obtain. The particular function-form correspondence associated to the given corpus is represented by means of a list of dictionaries of classes of parameterized F0 patterns, where the access key is given by a sequence of prosodic features. MEMOInt can also be used to obtain valuable information about the relative impact of the use of different parameterization techniques of F0 contours or of different types of intonation units and information about the relevance of different prosodic features. The methodology has been specifically designed to provide a successful strategy to solve the data sparseness problem which usually affects corpora as a consequence of the inherent high variability of the intonation phenomenon.

Keywords

Data mining Text-to-speech Prosody