Article code: 6961178
Journal code: 1452034
Publication year: 2015
English article: 19-page PDF
Full text: free download
English title of the ISI article
Quality prediction of synthesized speech based on perceptual quality dimensions
Persian translation of the title
پیش بینی کیفیت سخنرانی سنتز شده بر اساس ابعاد کیفی ادراکی
Keywords
Speech quality prediction, synthetic speech perception, text-to-speech, instrumental quality assessment, perceptual regularization
Related topics
Engineering and Basic Sciences, Computer Engineering, Signal Processing
English abstract
Instrumental speech-quality prediction for text-to-speech signals is explored in a twofold manner. First, the perceptual quality space of TTS is structured by means of three perceptual quality dimensions which are derived from multiple auditory tests. Second, quality-prediction models are evaluated for each dimension using prosodic and MFCC-based measurands. Linear and nonlinear model types are compared under cross-validation restrictions, giving detailed insight into model-generalizability aspects. Perceptually regularized properties, denoted as quality elements, are introduced in order to encode the quality-indicative effect of individual signal characteristics. These elements integrate a perceptual model reference which is derived in a semi-supervised fashion from natural and synthetic speech. The results highlight the feasibility of instrumental quality prediction for TTS signals provided that broad training material is employed. High prediction accuracy, however, requires nonlinear model structures.
Publisher
Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 66, February 2015, Pages 17-35