Article code: 4973766
Journal code: 1451710
Publication year: 2017
English article: 11 pages, PDF
Full-text version: free download
English title of the ISI article
Non-intrusive speech quality estimation as combination of estimates using multiple time-scale auditory features
Keywords
Non-intrusive, speech quality, multiple time-scale features, auditory model, degraded speech
Related topics
Engineering and basic sciences, Computer engineering, Signal processing
Abstract
The human auditory system is modeled by different auditory models that represent the distribution of speech sound energy in different channels across the cochlea using filter banks of different bandwidths. In previous algorithms for non-intrusive speech quality evaluation, auditory features are determined using these auditory models on a per-frame basis and then averaged over the entire speech utterance. In these approaches, the effects of impulsive noise and other non-stationary noise are averaged out over the utterance. Because speech features vary from frame to frame, a multiple time-scale features approach has been proposed to capture the variation of the features over time within the utterance; it accounts for the variation of noise characteristics over the utterance and thus for its effect on quality mapping. In this work, non-intrusive speech quality evaluation is performed as an optimal linear combination of quality mappings, called objective mean opinion scores (MOS), computed using multiple time-scale estimates of features. The objective MOS of each of the multiple time-scale estimates (the combination of multiple active speeches) is obtained using a probabilistic approach. The overall objective MOS of the speech utterance is computed by taking the optimal linear combination of the objective MOS values estimated from the multiple time-scale feature estimates, where optimality is based on either the minimum mean square error (MMSE) criterion or the correlation maximization criterion. The results are given in terms of Pearson's correlation coefficient and root mean square error (RMSE) between the subjective MOS and the estimated overall objective MOS for three different standard databases. The results are compared with a single time-scale features approach, ITU-T Recommendation P.563, and recent algorithms.
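The combination step described in the abstract can be illustrated with a minimal sketch. It assumes that the per-time-scale objective MOS values are already available (the paper's probabilistic mapping is not reproduced here) and that the MMSE-optimal linear combination is fitted by ordinary least squares with a bias term. The function names, the bias term, and the synthetic data are all hypothetical and serve only as illustration; they are not taken from the paper.

```python
import numpy as np


def fit_mmse_weights(per_scale_mos, subjective_mos):
    """Least-squares weights (plus bias) that minimise the mean square error.

    per_scale_mos: array of shape (n_utterances, n_time_scales)
    subjective_mos: array of shape (n_utterances,)
    """
    # Append a column of ones so the last weight acts as a bias (assumption).
    X = np.column_stack([per_scale_mos, np.ones(len(per_scale_mos))])
    w, *_ = np.linalg.lstsq(X, subjective_mos, rcond=None)
    return w


def combine(per_scale_mos, w):
    """Overall objective MOS as a linear combination of per-scale estimates."""
    X = np.column_stack([per_scale_mos, np.ones(len(per_scale_mos))])
    return X @ w


def pearson(a, b):
    """Pearson's correlation coefficient between two score vectors."""
    return float(np.corrcoef(a, b)[0, 1])


def rmse(a, b):
    """Root mean square error between two score vectors."""
    return float(np.sqrt(np.mean((np.asarray(a) - np.asarray(b)) ** 2)))


# Synthetic stand-in data (NOT from the paper): three time scales, each a
# noisy version of a made-up subjective MOS in the range 1 to 5.
rng = np.random.default_rng(0)
subjective = rng.uniform(1.0, 5.0, size=200)
noise_std = [0.6, 0.8, 1.0]
per_scale = np.column_stack(
    [subjective + rng.normal(0.0, s, size=200) for s in noise_std]
)

weights = fit_mmse_weights(per_scale, subjective)
overall = combine(per_scale, weights)

print("weights (last entry is the bias):", weights)
print("Pearson:", pearson(overall, subjective))
print("RMSE:", rmse(overall, subjective))
```

On the data used for fitting, the combined score cannot have a higher RMSE than any single time-scale estimate, because each single estimate is itself one of the candidate linear combinations. A real evaluation would fit the weights on training utterances and report Pearson's correlation and RMSE on held-out data.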
Publisher
Database: Elsevier - ScienceDirect
Journal: Digital Signal Processing - Volume 70, November 2017, Pages 114-124