| Article code | Journal code | Publication year | English article | Full-text version |
|---|---|---|---|---|
| 558213 | 1451691 | 2016 | 23-page PDF | Free download |
• Quantitative comparison of vocal tract profiles considering multiple regions.
• Normalised differences allowing intra- and inter-speaker comparisons.
• Analysis of speech production considering multiple speakers/realisations.
• Abstract visual representation to support analysis of the computed difference data.
Articulatory data can nowadays be obtained using a wide range of techniques, with a notable emphasis on imaging modalities such as ultrasound and real-time magnetic resonance, resulting in large amounts of image data. One of the major challenges posed by these large datasets is how to analyse them efficiently to extract information relevant to speech production studies. Traditional approaches, such as the superposition of vocal tract profiles, provide only a qualitative characterisation of notable properties and differences. While they yield valuable information, these methods are rather inefficient and inherently subjective; analysis must therefore evolve towards a more automated, replicable and quantitative approach.

To address these issues, we propose the use of objective measures to compare the configurations assumed by the vocal tract during the production of different sounds. The proposed framework provides quantitative, normalised data on differences across meaningful regions under the influence of the various articulators. An important part of the framework is a visual representation of the data, proposed to support analysis, depicting the differences found and the corresponding direction of change. The normalised nature of the computed data allows different sounds and speakers to be compared in a common representation.

Representative application examples, concerning the articulatory characterisation of European Portuguese vowels, illustrate the capabilities of the proposed framework, both for static configurations and for the assessment of dynamic aspects of speech production.
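The abstract does not give the framework's formulas, but the core idea of region-wise, normalised differences between vocal tract profiles can be sketched as follows. This is a minimal illustration, not the authors' method: the function name `regional_differences`, the region names, and the choice to normalise by the largest regional difference are all assumptions for the example.

```python
import numpy as np

def regional_differences(profile_a, profile_b, region_slices):
    """Compare two vocal tract midline profiles region by region.

    profile_a, profile_b: (N, 2) arrays of contour points, assumed
    already aligned and resampled to the same N points.
    region_slices: maps a region name (hypothetical articulator zones
    here) to a slice over the point index.
    Returns normalised mean point-to-point distances per region.
    """
    # Euclidean distance between corresponding contour points
    dists = np.linalg.norm(profile_a - profile_b, axis=1)
    raw = {name: dists[sl].mean() for name, sl in region_slices.items()}
    # Normalise by the largest regional difference so values lie in [0, 1],
    # enabling comparison across sounds and speakers (one possible scheme)
    scale = max(raw.values()) or 1.0
    return {name: value / scale for name, value in raw.items()}

# Toy example: two synthetic 100-point profiles and three illustrative regions
theta = np.linspace(0, np.pi, 100)
a = np.column_stack([np.cos(theta), np.sin(theta)])
b = a + np.array([0.0, 0.1])  # uniform vertical displacement
regions = {"lips": slice(0, 30), "palate": slice(30, 70), "pharynx": slice(70, 100)}
print(regional_differences(a, b, regions))
```

Because the displacement in the toy example is uniform, every region shows the same normalised difference; real profiles would reveal which articulator regions differ most between two sounds.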
Journal: Computer Speech & Language - Volume 36, March 2016, Pages 307–329