Free article download: Estimation of binaural speech intelligibility using machine learning

Article code	Journal code	Publication year	English article	Full-text version
5010774	1462381	2018	9-page PDF	Free download

English title of the ISI article

Estimation of binaural speech intelligibility using machine learning

Persian translation of the title

ارزیابی قابلیت تشخیص گفتار دو طرفه با استفاده از یادگیری ماشین

Keywords

Speech intelligibility, binaural speech, objective estimation, machine learning, diagnostic rhyme test

Related topics

Engineering and basic sciences, other engineering disciplines, mechanical engineering

Abstract

We proposed and evaluated a speech intelligibility estimation method for binaural signals, assuming that both the speech and the competing noise are directional sources. In this case, when the speech and noise sources are located away from each other, intelligibility generally improves because the auditory system can segregate the two streams. However, because intelligibility tests and intelligibility estimation are both conducted on monaurally recorded signals, this potential gain from source segregation is not accounted for, and intelligibility is often underestimated. To estimate intelligibility while taking this binaural advantage into account, we trained a mapping function between subjective intelligibility and objective measures that reflect the binaural advantage. We calculated the SNR on (1) a simple binaural-to-monaural mix-down, which models the conventional estimation, (2) a simple pooling of both binaural channels (pooled channel), (3) the channel with the better SNR of the left and right channels (better-ear), and (4) a sub-band-wise better-ear selection (band-wise better-ear). For the mapping function, we tried neural networks (NN), support vector regression (SVR), and random forests (RF), and compared these with simple logistic regression (LR). We also investigated which sub-band configuration gives the best estimation accuracy by balancing the frequency resolution against the amount of training data. The combination of the better-ear model and RF gave the best results, with a root mean square error (RMSE) of about 0.11 and a correlation of 0.92 in an open-set test.
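
The four SNR variants listed in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, the "pooled channel" is assumed here to mean summing the power of both ears before forming the ratio, and the band-wise variant assumes per-band SNRs (in dB) have already been computed by some filterbank that is not shown.

```python
import math

def power(x):
    """Mean-square power of a waveform given as a sequence of floats."""
    return sum(v * v for v in x) / len(x)

def snr_db(speech_power, noise_power):
    """Signal-to-noise ratio in decibels from two power values."""
    return 10.0 * math.log10(speech_power / noise_power)

def mixdown_snr(s_l, s_r, n_l, n_r):
    """(1) Average left and right into one mono signal, then take the SNR."""
    s_mono = [(a + b) / 2.0 for a, b in zip(s_l, s_r)]
    n_mono = [(a + b) / 2.0 for a, b in zip(n_l, n_r)]
    return snr_db(power(s_mono), power(n_mono))

def pooled_snr(s_l, s_r, n_l, n_r):
    """(2) Pool the power of both ears before forming the ratio (assumed definition)."""
    return snr_db(power(s_l) + power(s_r), power(n_l) + power(n_r))

def better_ear_snr(s_l, s_r, n_l, n_r):
    """(3) Compute the SNR at each ear and keep the higher one."""
    return max(snr_db(power(s_l), power(n_l)),
               snr_db(power(s_r), power(n_r)))

def bandwise_better_ear(snr_left_bands, snr_right_bands):
    """(4) For every sub-band, keep the higher of the left and right SNR (dB)."""
    return [max(l, r) for l, r in zip(snr_left_bands, snr_right_bands)]

# Toy example: speech arrives equally at both ears, noise is much
# stronger at the right ear (as if the noise source were on the right).
pattern = [1.0, -1.0, 1.0, -1.0]
speech_left = pattern
speech_right = pattern
noise_left = [0.1 * v for v in pattern]
noise_right = [1.0 * v for v in pattern]

mix = mixdown_snr(speech_left, speech_right, noise_left, noise_right)
pooled = pooled_snr(speech_left, speech_right, noise_left, noise_right)
better = better_ear_snr(speech_left, speech_right, noise_left, noise_right)
per_band = bandwise_better_ear([3.0, -2.0, 8.0], [1.0, 4.0, 6.0])
```

In this toy case the better-ear value is higher than both the mix-down and the pooled values, which mirrors the abstract's point that a monaural estimate can understate the SNR available to a binaural listener. Any of these measures could then serve as the input feature for a regression model such as a random forest.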

Publisher

Database: Elsevier - ScienceDirect
Journal: Applied Acoustics - Volume 129, 1 January 2018, Pages 408-416

Authors

Kazuhiro Kondo, Kazuya Taira
