Imitating Dialog Strategies Under Uncertainty

Article ID	Journal	Published Year	Pages	File Type
487576	Procedia Computer Science	2014	8 Pages	PDF

Abstract

We consider human-robot interaction involving a service robot and many different users in a public environment. The task is to learn a dialog policy that deals with changing user goals, can act under uncertainty, and is easy to apply in practice. Unlike reinforcement-learning-based systems, our simulator-free approach avoids common problems such as reward tuning and state space exploration: we apply imitation learning to mimic an expert's behavior based on a small number of Wizard-of-Oz experiments. A dynamic Bayesian network is used to track hidden user goals. We evaluate our approach in a simulated environment and show that, by using lifelong model updates, it is possible to apply the expert's policy correctly even if the user behavior changes over time.
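
The abstract does not describe the structure of the dynamic Bayesian network, so the following is only a minimal sketch of how tracking a hidden user goal might work in principle. It assumes a single discrete goal variable, a hypothetical goal-transition model (users may change goals between turns), and a hypothetical observation likelihood for the latest user utterance. The function name `update_belief`, the goal labels, and all probabilities are illustrative assumptions and are not taken from the paper.

```python
from typing import Dict

Belief = Dict[str, float]


def update_belief(
    belief: Belief,
    transition: Dict[str, Dict[str, float]],
    observation_likelihood: Dict[str, float],
) -> Belief:
    """One step of a discrete Bayes filter over hidden user goals.

    belief[g]                 -- P(goal = g) before this dialog turn
    transition[g1][g2]        -- assumed P(next goal = g2 | current goal = g1)
    observation_likelihood[g] -- assumed P(observed utterance | goal = g)
    """
    goals = list(belief)

    # Prediction step: account for the user possibly changing goals.
    predicted = {
        g2: sum(belief[g1] * transition[g1][g2] for g1 in goals)
        for g2 in goals
    }

    # Correction step: weight each goal by how well it explains the utterance.
    unnormalized = {g: predicted[g] * observation_likelihood[g] for g in goals}
    total = sum(unnormalized.values())
    if total == 0.0:
        # The observation is impossible under every goal; keep the prediction.
        return predicted
    return {g: p / total for g, p in unnormalized.items()}


# Hypothetical example with two goals and made-up probabilities.
prior = {"coffee": 0.5, "directions": 0.5}
transition = {
    "coffee": {"coffee": 0.9, "directions": 0.1},
    "directions": {"coffee": 0.1, "directions": 0.9},
}
likelihood = {"coffee": 0.8, "directions": 0.2}

posterior = update_belief(prior, transition, likelihood)
```

In this sketch a dialog policy would act on `posterior` rather than on a single assumed goal, which is one way to act under uncertainty. The lifelong model updates mentioned in the abstract would correspond to re-estimating the transition and observation tables as new interactions arrive; that part is not shown here.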