Free article download: Reinforcement-learning based dialogue system for human–robot interactions with socially-inspired rewards

Article code	Journal code	Publication year	English article	Full-text version
559006	875029	2015	19-page PDF	Free download

English title of the ISI article

Reinforcement-learning based dialogue system for human–robot interactions with socially-inspired rewards

Persian translation of the title

سیستم گفتمان تقویتی-یادگیری مبتنی بر تعامل انسان با روبات با پاداش الهام گرفته از اجتماع


Keywords

Human–robot interaction; POMDP-based dialogue management; Reinforcement learning; Reward shaping

Related subjects

Engineering and Basic Sciences; Computer Engineering; Signal Processing


English abstract

• We integrate user appraisals in a POMDP-based dialogue manager procedure.
• We employ additional socially-inspired rewards in a RL setup to guide the learning.
• A unified framework for speeding up the policy optimisation and user adaptation.
• We consider a potential-based reward shaping with a sample efficient RL algorithm.
• Evaluated using both user simulator (information retrieval) and user trials (HRI).

This paper investigates some conditions under which polarized user appraisals gathered throughout the course of a vocal interaction between a machine and a human can be integrated in a reinforcement learning-based dialogue manager. More specifically, we discuss how this information can be cast into socially-inspired rewards for speeding up the policy optimisation for both efficient task completion and user adaptation in an online learning setting. For this purpose a potential-based reward shaping method is combined with a sample efficient reinforcement learning algorithm to offer a principled framework to cope with these potentially noisy interim rewards. The proposed scheme will greatly facilitate the system's development by allowing the designer to teach his system through explicit positive/negative feedbacks given as hints about task progress, in the early stage of training. At a later stage, the approach will be used as a way to ease the adaptation of the dialogue policy to specific user profiles. Experiments carried out using a state-of-the-art goal-oriented dialogue management framework, the Hidden Information State (HIS), support our claims in two configurations: firstly, with a user simulator in the tourist information domain (and thus simulated appraisals), and secondly, in the context of man–robot dialogue with real user trials.
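The potential-based reward shaping mentioned in the abstract can be illustrated with a minimal sketch. It uses the standard shaping term F(s, s') = γ·Φ(s') − Φ(s) added to the environment reward. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the integer turn index used as a state, and the toy potential standing in for accumulated user appraisals are all hypothetical.

```python
from typing import Callable, Hashable, Sequence

State = Hashable


def shaped_reward(
    env_reward: float,
    state: State,
    next_state: State,
    potential: Callable[[State], float],
    gamma: float = 0.9,
    terminal: bool = False,
) -> float:
    """Return r + F, where F = gamma * phi(s') - phi(s).

    The potential of a terminal state is taken to be zero, a common
    convention for episodic tasks (an assumption of this sketch).
    """
    next_phi = 0.0 if terminal else potential(next_state)
    return env_reward + gamma * next_phi - potential(state)


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Sum of gamma**t * r_t over one episode."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def appraisal_potential(turn: int) -> float:
    """Hypothetical potential: grows as the dialogue progresses,
    standing in for a score derived from positive/negative user feedback."""
    return 0.5 * (turn + 1)


# Toy three-turn dialogue: per-turn penalty, then a success bonus.
GAMMA = 0.9
states = [0, 1, 2, 3]            # state 3 is terminal
env_rewards = [-1.0, -1.0, 20.0]

shaped = [
    shaped_reward(
        env_rewards[t],
        states[t],
        states[t + 1],
        appraisal_potential,
        gamma=GAMMA,
        terminal=(t == len(env_rewards) - 1),
    )
    for t in range(len(env_rewards))
]

plain_return = discounted_return(env_rewards, GAMMA)
shaped_return = discounted_return(shaped, GAMMA)
```

Because the shaping terms telescope, the shaped return differs from the plain return only by the constant −Φ(s₀). This is why shaping of this form can inject interim hints without changing which policy is optimal.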

Publisher

Database: Elsevier - ScienceDirect
Journal: Computer Speech & Language - Volume 34, Issue 1, November 2015, Pages 256–274

Authors

Emmanuel Ferreira, Fabrice Lefèvre
