Compliant skills acquisition and multi-optima policy search with EM-based reinforcement learning

کد مقاله	کد نشریه	سال انتشار	مقاله انگلیسی	نسخه تمام متن
413109	679739	2013	11 صفحه PDF	دانلود رایگان

عنوان انگلیسی مقاله ISI

دانلود مقاله + سفارش ترجمه

دانلود مقاله ISI انگلیسی

رایگان برای ایرانیان

کلمات کلیدی

Skills transfer Expectation-Maximization - انتظار-حداکثر سازی Dynamical systems - سیستم های دینامیکی Gaussian mixture model - مدل مخلوط Gaussian Reinforcement learning - یادگیری تقویتی

موضوعات مرتبط

مهندسی و علوم پایه مهندسی کامپیوتر هوش مصنوعی

پیش نمایش صفحه اول مقاله

Compliant skills acquisition and multi-optima policy search with EM-based reinforcement learning

چکیده انگلیسی

The democratization of robotics technology and the development of new actuators progressively bring robots closer to humans. The applications that can now be envisaged drastically contrast with the requirements of industrial robots. In standard manufacturing settings, the criterions used to assess performance are usually related to the robot’s accuracy, repeatability, speed or stiffness. Learning a control policy to actuate such robots is characterized by the search of a single solution for the task, with a representation of the policy consisting of moving the robot through a set of points to follow a trajectory. With new environments such as homes and offices populated with humans, the reproduction performance is portrayed differently. These robots are expected to acquire rich motor skills that can be generalized to new situations, while behaving safely in the vicinity of users. Skills acquisition can no longer be guided by a single form of learning, and must instead combine different approaches to continuously create, adapt and refine policies. The family of search strategies based on expectation-maximization (EM) looks particularly promising to cope with these new requirements. The exploration can be performed directly in the policy parameters space, by refining the policy together with exploration parameters represented in the form of covariances. With this formulation, RL can be extended to a multi-optima search problem in which several policy alternatives can be considered. We present here two applications exploiting EM-based exploration strategies, by considering parameterized policies based on dynamical systems, and by using Gaussian mixture models for the search of multiple policy alternatives.

► Reinforcement learning with exploration in the policy parameters space.
► Exploration noise learned together with the policy parameters.
► Compact representation of movement and compliance with dynamical systems.
► Multi-optima policy search with expectation-maximization algorithm.

ناشر

Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Robotics and Autonomous Systems - Volume 61, Issue 4, April 2013, Pages 369–379

نویسندگان

Sylvain Calinon, Petar Kormushev, Darwin G. Caldwell,

علوم انسانی و هنر

فنی، مهندسی و علوم پایه

پزشکی و سلامت

بیو تکنولوژی

پذیرش سفارش ترجمه

Compliant skills acquisition and multi-optima policy search with EM-based reinforcement learning

دسترسی سریع

ارتباط

English Website