Article code: 5010774
Journal code: 1462381
Publication year: 2018
English article: 9 pages, PDF
Full-text version: free download
English title of the ISI article
Estimation of binaural speech intelligibility using machine learning
Persian translation of the title
ارزیابی قابلیت تشخیص گفتار دو طرفه با استفاده از یادگیری ماشین
Keywords
Speech intelligibility, binaural speech, objective estimation, machine learning, diagnostic rhyme test
Related topics
Engineering and basic sciences > Other engineering disciplines > Mechanical engineering
English abstract

We proposed and evaluated a speech intelligibility estimation method for binaural signals. The assumption was that both the speech and the competing noise are directional sources. In this case, when the speech and noise sources are located away from each other, intelligibility generally improves because the auditory system can segregate the two streams. However, because both intelligibility tests and intelligibility estimation are conducted on monaurally recorded signals, this potential increase in intelligibility due to source segregation is not accounted for, and intelligibility is often underestimated. Accordingly, to estimate intelligibility while taking this binaural advantage into account, we trained a mapping function between subjective intelligibility and objective measures that reflect the binaural advantage. We calculated the SNR on (1) a simple binaural-to-monaural mix-down, which models the conventional estimation, (2) a simple pooling of both binaural channels (pooled channel), (3) the left or right channel signal, whichever has the better SNR (better-ear), and (4) a sub-band-wise better-ear selection (band-wise better-ear). For the mapping function training, we tried neural networks (NN), support vector regression (SVR), and random forests (RF), and compared these to simple logistic regression (LR). We also investigated which sub-band configuration gives the best estimation accuracy by balancing the frequency resolution against the amount of training data. The combination of the better-ear model and RF gave the best results, with a root mean square error (RMSE) of about 0.11 and a correlation of 0.92 in an open-set test.
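
The four SNR variants listed in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the clean speech and the noise are available separately for each ear, that a plain (unweighted) SNR in dB is used, and that "pooling" means summing the powers of both channels. All function names and the toy signals are hypothetical.

```python
import math

def power(x):
    """Mean power of a list of samples."""
    return sum(v * v for v in x) / len(x)

def snr_db(speech_power, noise_power):
    """Plain SNR in dB from a speech power and a noise power."""
    return 10.0 * math.log10(speech_power / noise_power)

def mixdown_snr(s_left, s_right, n_left, n_right):
    """(1) Average the two ears into one monaural signal, then compute SNR."""
    s = [(a + b) / 2.0 for a, b in zip(s_left, s_right)]
    n = [(a + b) / 2.0 for a, b in zip(n_left, n_right)]
    return snr_db(power(s), power(n))

def pooled_snr(s_left, s_right, n_left, n_right):
    """(2) Pool both channels by summing their powers (an assumed reading)."""
    return snr_db(power(s_left) + power(s_right),
                  power(n_left) + power(n_right))

def better_ear_snr(s_left, s_right, n_left, n_right):
    """(3) Use whichever ear has the higher full-band SNR."""
    return max(snr_db(power(s_left), power(n_left)),
               snr_db(power(s_right), power(n_right)))

def bandwise_better_ear_snr(bands_left, bands_right):
    """(4) Pick the better ear independently in each sub-band.

    Each argument is a list of (speech_power, noise_power) pairs,
    one pair per sub-band; the filter bank itself is out of scope here.
    """
    return [max(snr_db(*l), snr_db(*r))
            for l, r in zip(bands_left, bands_right)]

# Toy example: speech is stronger at the left ear, noise at the right ear.
s_left, s_right = [1.0, -1.0, 1.0, -1.0], [0.5, -0.5, 0.5, -0.5]
n_left, n_right = [0.1, 0.1, -0.1, -0.1], [0.4, 0.4, -0.4, -0.4]

be = better_ear_snr(s_left, s_right, n_left, n_right)
pooled = pooled_snr(s_left, s_right, n_left, n_right)
mono = mixdown_snr(s_left, s_right, n_left, n_right)
bands = bandwise_better_ear_snr([(1.0, 0.01), (0.1, 0.1)],
                                [(0.25, 0.16), (0.4, 0.04)])
```

In this toy case the better-ear SNR is higher than both the pooled and the mix-down values, which is the kind of binaural advantage the abstract says a monaural estimate misses.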

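As a rough illustration of the baseline mapping and the two reported metrics, the sketch below maps an SNR to an intelligibility score with a logistic curve and evaluates predictions with RMSE and Pearson correlation. The logistic parameters, the 0-to-1 score range, and the sample values are assumptions for illustration only. None of them are taken from the paper, and the paper's trained models (NN, SVR, RF) are not reproduced here.

```python
import math

def logistic_map(snr_db, midpoint_db, slope):
    """Map an SNR in dB to a score in (0, 1); parameters are hypothetical."""
    return 1.0 / (1.0 + math.exp(-slope * (snr_db - midpoint_db)))

def rmse(predicted, target):
    """Root mean square error between two equal-length sequences."""
    n = len(predicted)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predicted, target)) / n)

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Made-up SNRs (dB) and made-up subjective scores, for illustration only.
snrs = [-10.0, -5.0, 0.0, 5.0, 10.0]
subjective = [0.10, 0.25, 0.50, 0.80, 0.95]

predicted = [logistic_map(s, midpoint_db=0.0, slope=0.3) for s in snrs]
error = rmse(predicted, subjective)
corr = pearson(predicted, subjective)
```

An open-set evaluation like the one in the abstract would compute these two metrics on data that was held out from training.
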
Publisher
Database: Elsevier - ScienceDirect
Journal: Applied Acoustics - Volume 129, 1 January 2018, Pages 408-416
Authors
, ,