Contextual Online Imitation Learning (COIL): Using Guide Policies in Reinforcement Learning

Alexander Hill*, Marc Groefsema, Matthia Sabatelli, Raffaella Carloni, Marco Grzegorczyk

*Corresponding author for this work

Research output: Contribution to conference › Paper › Academic


Abstract

This paper proposes a novel method of utilising guide policies in Reinforcement Learning problems: Contextual Online Imitation Learning (COIL). COIL is shown to offer improved performance over both offline Imitation Learning methods such as Behavioral Cloning and Reinforcement Learning algorithms, such as Proximal Policy Optimisation, that do not take advantage of existing guide policies. An important characteristic of COIL is that it can effectively utilise guide policies that exhibit expert behaviour in only a strict subset of the state space, making it more flexible than classical methods of Imitation Learning. Through COIL, guide policies that achieve good performance on sub-tasks can also help Reinforcement Learning agents solve more complex tasks, a significant improvement in flexibility over traditional Imitation Learning methods. After introducing the theory and motivation behind COIL […]
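The abstract stops short of the algorithm itself, but its central idea, trusting a guide policy only inside the strict subset of the state space where it behaves expertly and letting the Reinforcement Learning agent act elsewhere, can be sketched in a few lines. The sketch below is an illustrative assumption, not the paper's actual method: the gating predicate guide_competent, the toy one-dimensional state, and both placeholder policies are hypothetical.

```python
import random

# Hypothetical sketch of context-gated action selection, loosely inspired by
# the abstract's description of COIL. The competence region, the policies,
# and the 1-D state are illustrative assumptions, not the paper's algorithm.

def guide_competent(state):
    # Assumed region of competence: the guide is trusted only for state < 0.
    return state < 0.0

def guide_policy(state):
    # Placeholder expert action within the guide's sub-task.
    return -1.0

def learner_policy(state):
    # Stand-in for the Reinforcement Learning agent's (stochastic) policy.
    return random.uniform(-1.0, 1.0)

def select_action(state):
    # Imitate the guide inside its region of competence; otherwise fall back
    # to the learner, which must handle the rest of the state space.
    return guide_policy(state) if guide_competent(state) else learner_policy(state)

if __name__ == "__main__":
    for s in (-0.5, 0.3):
        print(f"state={s:+.1f} -> action={select_action(s):+.3f}")
```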
Original language: English
Pages: 178-185
Number of pages: 8
DOIs
Publication status: Published - Feb-2024
Event: International Conference on Agents and Artificial Intelligence - Rome, Italy
Duration: 24-Feb-2024 → 26-Feb-2024
https://icaart.scitevents.org/?y=2024

Conference

Conference: International Conference on Agents and Artificial Intelligence
Abbreviated title: ICAART 2024
Country/Territory: Italy
City: Rome
Period: 24/02/2024 → 26/02/2024
Internet address: https://icaart.scitevents.org/?y=2024
