Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
413430 | Robotics and Autonomous Systems | 2013 | 15 Pages |
• We propose an online learning strategy for locomotion in modular robots.
• The strategy is designed to be minimalistic and distributed.
• We experimentally study the strategy in simulation and on physical modular robots.
• Our findings include the fact that the strategy is morphology independent and can adapt to module faults and self-reconfiguration.
In this paper, we present a distributed reinforcement learning strategy for morphology-independent, life-long gait learning in modular robots. All modules run identical controllers that locally and independently optimize their action selection, using the robot’s velocity as a global, shared reward signal. We evaluate the strategy experimentally, mainly on simulated but also on physical modular robots. We find that the strategy: (i) converges, for six of seven configurations (3–12 modules), in 96% of the trials to the best known action-based gaits within 15 min on average, (ii) can be transferred to physical robots with comparable performance, (iii) can be applied to learn simple gait control tables for both M-TRAN and ATRON robots, (iv) enables an 8-module robot to adapt to faults and changes in its morphology, and (v) can learn gaits for robots of up to 60 modules, although a divergence effect becomes substantial from 20–30 modules. These experiments demonstrate the advantages of a distributed learning strategy for modular robots, such as simplicity of implementation, low resource requirements, morphology independence, reconfigurability, and fault tolerance.
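The control scheme described in the abstract, identical controllers that each learn independently from one shared velocity reward, can be illustrated with a minimal sketch. The abstract does not state the learning rule, so the sketch assumes a stateless, epsilon-greedy action-value update. The names `ModuleLearner`, `toy_velocity`, and `run`, the parameters `alpha` and `epsilon`, and the toy reward function are all hypothetical and are not taken from the paper.

```python
import random


class ModuleLearner:
    """Controller run by every module (assumed rule: stateless action-value learning)."""

    def __init__(self, n_actions, alpha=0.1, epsilon=0.2, rng=None):
        self.q = [0.0] * n_actions      # one value estimate per action
        self.alpha = alpha              # learning rate (assumed value)
        self.epsilon = epsilon          # exploration probability (assumed value)
        self.rng = rng or random.Random()
        self.last_action = 0

    def select_action(self):
        # Explore with probability epsilon, otherwise pick a best-valued action.
        if self.rng.random() < self.epsilon:
            self.last_action = self.rng.randrange(len(self.q))
        else:
            best = max(self.q)
            ties = [a for a, v in enumerate(self.q) if v == best]
            self.last_action = self.rng.choice(ties)
        return self.last_action

    def update(self, reward):
        # Move the estimate of the last action toward the shared reward.
        a = self.last_action
        self.q[a] += self.alpha * (reward - self.q[a])


def toy_velocity(actions):
    # Hypothetical stand-in for the measured robot velocity: the fraction of
    # modules whose action matches an arbitrary target pattern.
    target = [i % 3 for i in range(len(actions))]
    return sum(1 for a, t in zip(actions, target) if a == t) / len(actions)


def run(n_modules=4, n_actions=3, steps=2000, seed=0):
    seeder = random.Random(seed)
    modules = [
        ModuleLearner(n_actions, rng=random.Random(seeder.randrange(2**32)))
        for _ in range(n_modules)
    ]
    for _ in range(steps):
        # Each module chooses locally; no module sees another module's choice.
        actions = [m.select_action() for m in modules]
        reward = toy_velocity(actions)  # single global reward signal
        for m in modules:
            m.update(reward)            # every module receives the same reward
    return modules
```

Because each module keeps only a small table of action values and needs only the broadcast reward, the per-module cost in this sketch is independent of the robot's morphology, which is in line with the minimalistic, distributed design that the highlights describe.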