Contextual Online Imitation Learning (COIL): Using Guide Policies in Reinforcement Learning

Alexander Hill*, Marc Groefsema, Matthia Sabatelli, Raffaella Carloni, Marco Grzegorczyk

*Corresponding author for this work

Research output: Academic

103 Downloads (Pure)

Abstract

This paper proposes a novel method of utilising guide policies in Reinforcement Learning problems: Contextual Online Imitation Learning (COIL). It demonstrates that COIL can offer improved performance over both offline Imitation Learning methods such as Behavioral Cloning and Reinforcement Learning algorithms such as Proximal Policy Optimisation that do not take advantage of existing guide policies. An important characteristic of COIL is that it can effectively utilise guide policies that exhibit expert behavior in only a strict subset of the state space, making it more flexible than classical Imitation Learning methods. Through COIL, guide policies that achieve good performance in sub-tasks can also be used to help Reinforcement Learning agents solve more complex tasks, a significant improvement in flexibility over traditional Imitation Learning. After introducing the theory and motivation behind COIL, the paper presents empirical results supporting these claims.
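The abstract describes COIL only at a high level. As a rough illustration of the core idea (an online policy-gradient update augmented by an imitation term that is active only in the region of the state space where the guide policy is trusted), here is a minimal PyTorch sketch. All names here (PolicyNet, coil_style_loss, guide_mask, beta) are illustrative assumptions, not the paper's actual formulation, and the RL term is a simplified REINFORCE-style loss rather than the full Proximal Policy Optimisation objective.

# Minimal sketch of a COIL-style update for a discrete action space.
# PolicyNet, coil_style_loss, guide_mask and beta are hypothetical names
# introduced for illustration; they are not taken from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PolicyNet(nn.Module):
    """Small policy network producing action logits."""
    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, n_actions)
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

def coil_style_loss(policy, obs, actions, advantages,
                    guide_actions, guide_mask, beta=0.5):
    """Policy-gradient loss plus an imitation term that is active only
    in states where the guide is trusted (guide_mask == 1)."""
    logits = policy(obs)
    log_probs = torch.log_softmax(logits, dim=-1)
    taken = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
    pg_loss = -(taken * advantages).mean()  # simplified RL term

    # Behavioral-cloning term toward the guide's actions, masked to the
    # guide's region of competence (the "context").
    bc_per_state = F.cross_entropy(logits, guide_actions, reduction="none")
    bc_loss = (bc_per_state * guide_mask).sum() / guide_mask.sum().clamp(min=1)
    return pg_loss + beta * bc_loss

# Toy usage: the guide is trusted only when the first observation feature
# is positive, standing in for "expert in a strict subset of the state space".
policy = PolicyNet(obs_dim=8, n_actions=4)
obs = torch.randn(32, 8)
actions = torch.randint(0, 4, (32,))
advantages = torch.randn(32)
guide_actions = torch.randint(0, 4, (32,))
guide_mask = (obs[:, 0] > 0).float()
loss = coil_style_loss(policy, obs, actions, advantages, guide_actions, guide_mask)
loss.backward()

In this sketch, guide_mask encodes the "context": a per-state indicator of whether the guide is considered expert there, so imitation pressure vanishes outside the guide's region of competence while ordinary Reinforcement Learning continues everywhere.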
Original language: English
Pages (from-to): 178-185
Number of pages: 8
DOIs
Status: Published - Feb 2024
Event: International Conference on Agents and Artificial Intelligence - Rome, Italy
Duration: 24 Feb 2024 - 26 Feb 2024
https://icaart.scitevents.org/?y=2024

Conference

Conference: International Conference on Agents and Artificial Intelligence
Abbreviated title: ICAART 2024
Country/Region: Italy
City: Rome
Period: 24/02/2024 - 26/02/2024
Internet address: https://icaart.scitevents.org/?y=2024
