Article code: 563110
Journal code: 875471
Publication year: 2013
English article: 19 pages, PDF
Full-text version: free download
English title of the ISI article
A dynamic tonal perception model for optimal pitch stylization
Related topics
Engineering and basic sciences, Computer engineering, Signal processing
English abstract

Automatic pitch stylization is an important resource for researchers working both on prosody and speech technologies. In order to be useful, the stylized F0 curve should contain the smallest possible number of control points while remaining, at the same time, close to the original curve from a perceptual point of view. Here, a pitch stylization algorithm aimed at finding the optimal balance between the number of employed control points and perceptual equality with respect to the original curve is presented. Rather than being defined by means of statistical closeness to the original F0 curve, the quality of the stylized curve is defined on the basis of a dynamic tonal perception model. The number of control points is optimized on the basis of previous results showing that the stylization can be more radical in those areas of the signal where tone perception is less accurate, i.e. in non-prominent areas. Perceptual tests show that, concerning the perceptual equality of the stylization, this approach performs as well as other reference ones, with the advantage of using a significantly lower number of control points. Although it is based on a theoretical background employing phonological units like syllables, the proposed phonetic approach does not require any preliminary segmentation or annotation step. It combines, instead, acoustic parameters related to syllabification and prominence detection into a single model which has been designed to be both integrated, in the sense that it does not introduce any pitfalls in the process, and dynamic, in the sense that it does not include rigid tonal perception thresholds.


► We present a pitch stylization algorithm based on a dynamic tonal perception model.
► We consider glissando perception capability as modulated by energy movements rather than through the use of thresholds.
► The presented approach is implemented as a process integrating features related to syllabification and prominence detection.
► Perceptual equality is preserved by the algorithm while fewer points are used in comparison with the reference approach.
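
The general idea of spending fewer control points where tone perception is less accurate can be illustrated with a small sketch. This is not the paper's dynamic tonal perception model: it swaps in a Douglas-Peucker-style split whose allowed deviation is assumed to grow as frame energy drops. The function name `stylize`, the parameters `base_tol` and `max_tol`, and the linear energy-to-tolerance mapping are all hypothetical choices made for illustration.

```python
from typing import List, Sequence


def stylize(f0: Sequence[float], energy: Sequence[float],
            base_tol: float = 0.5, max_tol: float = 2.0) -> List[int]:
    """Return indices of control points for a piecewise-linear stylization.

    Hypothetical illustration only, not the published algorithm: the
    tolerated deviation (in the units of f0, e.g. semitones) shrinks
    linearly from max_tol at the lowest-energy frame to base_tol at the
    highest-energy frame.
    """
    if len(f0) != len(energy):
        raise ValueError("f0 and energy must have the same length")
    if len(f0) < 2:
        return list(range(len(f0)))

    e_min, e_max = min(energy), max(energy)
    # With constant energy the span is zero; every frame then gets max_tol.
    span = (e_max - e_min) or 1.0

    def tolerance(i: int) -> float:
        # Relative energy in [0, 1]; 1 marks the most energetic frame.
        rel = (energy[i] - e_min) / span
        return max_tol - rel * (max_tol - base_tol)

    keep = {0, len(f0) - 1}
    stack = [(0, len(f0) - 1)]
    while stack:
        lo, hi = stack.pop()
        worst, worst_excess = -1, 0.0
        for i in range(lo + 1, hi):
            # Straight line between the two current control points.
            interp = f0[lo] + (f0[hi] - f0[lo]) * (i - lo) / (hi - lo)
            excess = abs(f0[i] - interp) - tolerance(i)
            if excess > worst_excess:
                worst, worst_excess = i, excess
        if worst >= 0:
            # Add a control point where the tolerance is exceeded the most.
            keep.add(worst)
            stack.append((lo, worst))
            stack.append((worst, hi))
    return sorted(keep)
```

Under these assumptions, a one-unit pitch bump in a high-energy frame keeps its own control point, while the same bump in a low-energy frame is smoothed away, which mirrors the idea that stylization can be more radical in non-prominent areas.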

Publisher
Database: Elsevier - ScienceDirect
Journal: Computer Speech & Language - Volume 27, Issue 1, January 2013, Pages 190–208
Authors
, , ,