کد مقاله کد نشریه سال انتشار مقاله انگلیسی ترجمه فارسی نسخه تمام متن
5010774 1369251 2018 9 صفحه PDF سفارش دهید دانلود کنید
عنوان انگلیسی مقاله
Estimation of binaural speech intelligibility using machine learning
موضوعات مرتبط
مهندسی و علوم پایه سایر رشته های مهندسی مهندسی مکانیک
پیش نمایش صفحه اول مقاله
Estimation of binaural speech intelligibility using machine learning
چکیده انگلیسی

We proposed and evaluated a speech intelligibility estimation method for binaural signals. The assumption here was that both the speech and competing noise are directional sources. In this case, when the speech and noise are located away from each other, the intelligibility generally improves since the auditory system can segregate these two streams. However, since intelligibility tests as well as its estimation is conducted based on monaurally-recorded signals, this potential increase in the intelligibility due to the segregation of sources is not accounted for, and the intelligibility is often under-estimated. Accordingly, in order to estimate the intelligibility taking into account this binaural advantage, we trained a mapping function between the subjective intelligibility and objective measures that account for the binaural advantage stated above. We attempted SNR calculation on (1) a simple binaural to monaural mix-down, which models the conventional estimation, (2) simple pooling of both binaural channels (pooled channel), (3) channel signal selection with the better SNR from left and right channels (better-ear), and (4) sub-band wise better-ear selection (band-wise better-ear). For the mapping function training, we tried neural networks (NN), support vector regression (SVR), and random forests (RF), and compared these to simple logistic regression (LR). We also investigated the sub-band configuration that gives the best estimation accuracy by balancing the frequency resolution and the amount of training data. It was found that the combination of the better-ear model and RF gave the best results, with root mean square error (RMSE) of about 0.11 and correlation of 0.92 in an open set test.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Applied Acoustics - Volume 129, 1 January 2018, Pages 408-416
نویسندگان
, ,