Article code: 563114
Journal code: 875471
Publication year: 2013
English article: 25-page PDF
Full-text version: Free download
English title of the ISI article
Investigating fuzzy-input fuzzy-output support vector machines for robust voice quality classification
Related topics
Engineering and Basic Sciences, Computer Engineering, Signal Processing
English abstract

The dynamic use of voice qualities in spoken language can reveal useful information about a speaker's attitude, mood and affective state. This information may be very desirable for a range of speech technology applications, on both the input and the output side. However, voice quality annotation of speech signals may frequently produce far from consistent labels. Groups of annotators may disagree on the perceived voice quality, but whom should one trust, or is the truth somewhere in between? The current study first describes a voice quality feature set that is suitable for differentiating voice qualities on a tense-to-breathy dimension. It then uses these features as inputs to a fuzzy-input fuzzy-output support vector machine (F2SVM) algorithm, which is capable of softly categorizing voice quality recordings. The F2SVM is compared in a thorough analysis to standard crisp approaches and shows promising results; for example, it outperforms standard support vector machines, the sole difference being that the F2SVM receives fuzzy label information during training. Overall, accuracies of around 90% are achieved for both speaker-dependent (cross-validation) and speaker-independent (leave-one-speaker-out validation) experiments. Additionally, the F2SVM approach reaches an accuracy of 82% in a cross-corpus experiment (i.e. training and testing on entirely different recording conditions) in a frame-wise analysis, and of around 97% after temporally integrating over full sentences. Furthermore, the fuzzy output measures gave performance close to that of human annotators.


► We model breathy-to-tense voice qualities as a continuous dimension.
► We test if fuzzy information provided by annotators helps improve classification.
► Fuzzy SVMs utilizing this information significantly outperform standard approaches.
► The approach generalizes well in cross-corpus and leave-one-speaker-out experiments.
► The employed feature set is suitable for differentiating these voice qualities.
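
Two ideas from the abstract can be illustrated with a minimal sketch: turning disagreeing annotator votes into fuzzy (soft) labels, and integrating frame-wise soft outputs over a full sentence. This is not the F2SVM itself. The three class names, the vote-fraction labeling rule, and the mean-based integration are assumptions chosen for illustration; the abstract does not specify how the study computes either step.

```python
from collections import Counter

# Assumed class inventory along the breathy-to-tense dimension (illustrative only).
CLASSES = ("breathy", "modal", "tense")


def fuzzy_label(votes, classes=CLASSES):
    """Convert a list of annotator votes into class memberships that sum to 1.

    Assumption: membership = fraction of annotators choosing each class.
    """
    counts = Counter(votes)
    total = len(votes)
    return [counts[c] / total for c in classes]


def integrate_frames(frame_memberships):
    """Combine per-frame soft outputs into one sentence-level soft output.

    Assumption: a plain mean over frames stands in for temporal integration.
    """
    n_frames = len(frame_memberships)
    n_classes = len(frame_memberships[0])
    return [
        sum(frame[j] for frame in frame_memberships) / n_frames
        for j in range(n_classes)
    ]


def harden(memberships, classes=CLASSES):
    """Pick the class with the largest membership (a crisp decision)."""
    best = max(range(len(classes)), key=lambda j: memberships[j])
    return classes[best]


if __name__ == "__main__":
    # Four hypothetical annotators disagree on one recording.
    print(fuzzy_label(["breathy", "breathy", "modal", "breathy"]))

    # Hypothetical frame-wise soft outputs for one sentence.
    frames = [[0.6, 0.4, 0.0], [0.2, 0.8, 0.0], [0.1, 0.7, 0.2]]
    sentence = integrate_frames(frames)
    print(sentence, harden(sentence))
```

In this toy example a single frame may favor one class while the sentence-level average favors another, which gives an intuition for why integrating over full sentences could raise accuracy relative to a frame-wise analysis.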

Publisher
Database: Elsevier - ScienceDirect
Journal: Computer Speech & Language - Volume 27, Issue 1, January 2013, Pages 263–287