Free article download: Room-localized spoken command recognition in multi-room, multi-microphone environments

Article code	Journal code	Publication year	English article	Full-text version
4973724	1451681	2017	35-page PDF	Free download

English title of the ISI article

Room-localized spoken command recognition in multi-room, multi-microphone environments

Keywords

Distant speech recognition; Channel selection; Smart homes; Beamforming; Decision fusion; Keyword spotting

Related topics

Engineering and basic sciences, Computer engineering, Signal processing

Abstract

The paper focuses on the design of a practical system pipeline for always-listening, far-field spoken command recognition in everyday smart indoor environments that consist of multiple rooms equipped with sparsely distributed microphone arrays. Such environments, for example domestic and multi-room offices, present challenging acoustic scenes to state-of-the-art speech recognizers, especially under always-listening operation, due to low signal-to-noise ratios, frequent overlaps of target speech, acoustic events, and background noise, as well as inter-room interference and reverberation. In addition, recognition of target commands often needs to be accompanied by their spatial localization, at least at the room level, to account for users in different rooms, providing command disambiguation and room-localized feedback. To address the above requirements, the use of parallel recognition pipelines is proposed, one per room of interest. The approach is enabled by a room-dependent speech activity detection module that employs appropriate multichannel features to determine speech segments and their room of origin, feeding them to the corresponding room-dependent pipelines for further processing. These consist of the traditional cascade of far-field spoken command detection and recognition, the former based on the detection of “activating” key-phrases. Robustness to the challenging environments is pursued by a number of multichannel combination and acoustic modeling techniques, thoroughly investigated in the paper. In particular, channel selection, beamforming, and decision fusion of single-channel results are considered, with the latter performing best. Additional gains are observed, when the employed acoustic models are trained on appropriately simulated reverberant and noisy speech data, and are channel-adapted to the target environments. 
Further issues investigated concern the inter-dependencies of the various system components, demonstrating the superiority of joint optimization of the component tunable parameters over their separate or sequential optimization. The proposed approach is developed for the Greek language, exhibiting promising performance in real recordings in a four-room apartment, as well as a two-room office. For example, in the latter, a 76.6% command recognition accuracy is achieved on a speaker-independent test, employing a 180-sentence decoding grammar. This result represents a 46% relative improvement over conventional beamforming.
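The abstract contrasts channel selection with decision fusion of single-channel results but does not state the fusion rule. The sketch below is therefore only an illustration under assumptions: each microphone channel is taken to yield one (command, confidence) hypothesis, channel selection keeps the single most confident channel, and fusion sums confidences per command. The function names, the confidence scale, and the English command strings are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def select_channel(channel_hyps):
    """Channel selection: keep the hypothesis of the most confident channel.

    channel_hyps is a list of (command, confidence) pairs, one per channel.
    """
    if not channel_hyps:
        raise ValueError("need at least one channel hypothesis")
    command, _ = max(channel_hyps, key=lambda hyp: hyp[1])
    return command


def fuse_decisions(channel_hyps):
    """Decision fusion (assumed rule): sum confidences per command, pick the best.

    Ties are broken alphabetically so the result is deterministic.
    """
    if not channel_hyps:
        raise ValueError("need at least one channel hypothesis")
    scores = defaultdict(float)
    for command, confidence in channel_hyps:
        scores[command] += confidence
    return max(sorted(scores), key=lambda cmd: scores[cmd])


# Hypothetical single-channel outputs from three microphones in one room.
hyps = [("lights off", 0.55), ("lights on", 0.50), ("lights on", 0.45)]

selected = select_channel(hyps)  # follows the single most confident channel
fused = fuse_decisions(hyps)     # follows the combined evidence of all channels
```

In this toy input the two strategies disagree: one channel is slightly more confident about a different command, while the other two channels agree with each other. This is the kind of case where combining single-channel decisions can help, although it says nothing about the gains actually reported in the paper.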

Publisher

Database: Elsevier - ScienceDirect
Journal: Computer Speech & Language - Volume 46, November 2017, Pages 419-443

Authors

Isidoros Rodomagoulakis, Athanasios Katsamanis, Gerasimos Potamianos, Panagiotis Giannoulis, Antigoni Tsiami, Petros Maragos
