Free article download: Effective word count estimation for long duration daily naturalistic audio recordings

Article code	Journal code	Publication year	English article	Full-text version
4977870	1452016	2016	25-page PDF	Free download

English title of the ISI article

Effective word count estimation for long duration daily naturalistic audio recordings

Keywords

Speaking rate

Related topics

Engineering and Basic Sciences, Computer Engineering, Signal Processing

English abstract

The ability to count words in extended audio sequences allows researchers to explore characteristics of speakers (i.e., leading, following, task responsibility, personal engagement), as well as the dynamics of two-way or multi-subject conversation scenarios. As such, counting the number of words spoken by a person offers a rich information source for several applications, such as health monitoring (e.g., autism, Parkinson's, and Alzheimer's), second language learning, or language development studies. However, developing robust word count systems that achieve high performance at low computational cost is very challenging due to the uncertain and dynamic behavior experienced in audio recordings. In this study, we address the problem for large-scale naturalistic audio recordings based on a 100-day audio collection entitled Prof-Life-Log. This corpus contains continuously recorded audio from one person using a mobile LENA audio recording device (LENA, 2015). The device captures audio for an entire workday, which can last up to 16 hours. Our proposed framework for word count estimation consists of five main components: (i) Speech Activity Detection (SAD) to remove non-speech parts of the signal, (ii) Speech Enhancement to suppress the effects of background noise, (iii) Primary vs. Secondary Speaker Detection to remove secondary speaker segments, (iv) Syllable Rate Estimation to estimate the syllable rate for the primary speaker, and (v) Linear Minimum Mean Square Error (LMMSE) Estimation to find the linear mapping between syllable rate and word rate in spontaneous speech. Despite its simplicity, the framework proves very effective in real scenarios, with good performance on various datasets. As an indication of performance, the error of the framework for an entire 16 h day audio file can be as low as 1% in terms of cumulative Word Count Error.
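
As a rough illustration of step (v) only, the sketch below fits a scalar LMMSE (linear) mapping from syllable rate to word rate and uses it to accumulate a word count over speech segments. It is not the authors' implementation: the function names, the per-segment rate and duration inputs, and the relative-error definition of cumulative Word Count Error are all assumptions made for this example, since the abstract does not specify them.

```python
import numpy as np

def fit_lmmse(syllable_rate, word_rate):
    """Fit word_rate ~ a * syllable_rate + b in the LMMSE sense.

    For a scalar predictor, the LMMSE solution is
    a = cov(s, w) / var(s) and b = mean(w) - a * mean(s).
    """
    s = np.asarray(syllable_rate, dtype=float)
    w = np.asarray(word_rate, dtype=float)
    mu_s, mu_w = s.mean(), w.mean()
    cov_sw = np.mean((s - mu_s) * (w - mu_w))
    var_s = np.mean((s - mu_s) ** 2)
    a = cov_sw / var_s
    b = mu_w - a * mu_s
    return a, b

def estimate_word_count(syllable_rate, durations, a, b):
    """Sum (estimated word rate * segment duration) over all segments.

    Rates are assumed to be per second and durations in seconds.
    """
    s = np.asarray(syllable_rate, dtype=float)
    d = np.asarray(durations, dtype=float)
    return float(np.sum((a * s + b) * d))

def cumulative_word_count_error(estimated, reference):
    """Relative error of the total count (an assumed definition)."""
    return abs(estimated - reference) / reference

# Toy training data (hypothetical): syllables/s and words/s per segment.
a, b = fit_lmmse([2.0, 4.0, 6.0], [1.5, 2.5, 3.5])

# Two primary-speaker segments of 10 s and 20 s at 4 syllables/s.
total_words = estimate_word_count([4.0, 4.0], [10.0, 20.0], a, b)
```

In a full pipeline, the syllable rates passed to `estimate_word_count` would come from steps (i) through (iv), that is, only from enhanced speech segments attributed to the primary speaker.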

Publisher

Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 84, November 2016, Pages 15-23

Authors

Ali Ziaei, Abhijeet Sangwan, John H.L. Hansen
