Reinforcement learning in robotic motion planning by combined experience-based planning and self-imitation learning

Sha Luo*, Lambert Schomaker

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

High-quality and representative data is essential for both Imitation Learning (IL)- and Reinforcement Learning (RL)-based motion planning tasks.
For real robots, it is challenging to collect enough qualified data either as
demonstrations for IL or experiences for RL due to safety considerations in
environments with obstacles. We target this challenge by proposing the self-imitation learning by planning plus (SILP+) algorithm, which efficiently embeds experience-based planning into the learning architecture to mitigate the
data-collection problem. The planner generates demonstrations based on successfully visited states from the current RL policy, and the policy improves
by learning from these demonstrations. In this way, we relieve the demand
for human expert operators to collect demonstrations required by IL and improve the RL performance as well. Various experimental results show that
SILP+ achieves better training efficiency and a higher, more stable success
rate in complex motion planning tasks compared to several other methods.
Extensive tests on physical robots illustrate the effectiveness of SILP+ in a
physical setting, retaining a success rate of 90% where the next-best contender drops from 87% to 75% in the Sim2Real transition.
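
The planning step described above can be illustrated with a minimal sketch. It is not the SILP+ implementation: the 2D point states, the connection radius, and the names `plan_demo` and `to_transitions` are assumptions made here for illustration, and Dijkstra's algorithm stands in for whatever planner the article uses. The sketch builds a graph over states the policy has already visited without collision, searches it for a path to the goal, and converts that path into demonstration transitions that a self-imitation learner could add to its replay buffer.

```python
import heapq
import math


def plan_demo(visited, start, goal, radius=1.5):
    """Shortest path through previously visited states (Dijkstra).

    visited: (x, y) states the policy reached without collision (assumed 2D).
    radius:  hypothetical maximum distance between directly connected states.
    Returns the list of states from start to goal, or None if unreachable.
    """
    nodes = list(dict.fromkeys([start, *visited, goal]))  # dedupe, keep order
    dist = {start: 0.0}
    parent = {}
    done = set()
    queue = [(0.0, start)]
    while queue:
        d, u = heapq.heappop(queue)
        if u in done:
            continue  # stale queue entry
        done.add(u)
        if u == goal:
            break
        for v in nodes:
            if v in done:
                continue
            step = math.dist(u, v)
            if step <= radius and d + step < dist.get(v, math.inf):
                dist[v] = d + step
                parent[v] = u
                heapq.heappush(queue, (d + step, v))
    if goal not in done:
        return None
    # Walk the parent links back from the goal to recover the path.
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return path[::-1]


def to_transitions(path):
    """Turn a state path into (state, action, next_state) tuples,
    taking the action to be the displacement between consecutive states."""
    return [(s, (n[0] - s[0], n[1] - s[1]), n) for s, n in zip(path, path[1:])]


# States gathered from earlier rollouts of the current policy (made-up values).
visited = [(1, 0), (2, 0), (2, 1), (2, 2), (0, 1)]
path = plan_demo(visited, start=(0, 0), goal=(3, 2))
demo_buffer = to_transitions(path) if path else []
```

Restricting the graph to visited states means every demonstration passes only through configurations the robot has already reached safely, which is the property the abstract relies on to avoid human-collected demonstrations.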
Original language: English
Article number: 104545
Number of pages: 14
Journal: Robotics and Autonomous Systems
Volume: 170
Early online date: 9-Oct-2023
Publication status: Published - Dec-2023

Keywords

  • self-imitation learning
  • reinforcement learning
  • robotics
  • motion planning
  • obstacle avoidance
