Abstract
This paper proposes Contextual Online Imitation Learning (COIL), a novel method for utilising guide policies in Reinforcement Learning problems. COIL offers improved performance over both offline Imitation Learning methods such as Behavioral Cloning and Reinforcement Learning algorithms, such as Proximal Policy Optimisation, that do not take advantage of existing guide policies. An important characteristic of COIL is that it can effectively utilise guide policies that exhibit expert behaviour in only a strict subset of the state space, making it more flexible than classical Imitation Learning methods. Through COIL, guide policies that achieve good performance on sub-tasks can also help Reinforcement Learning agents solve more complex tasks, a significant improvement in flexibility over traditional Imitation Learning. After introducing the theory and motivation behind COIL
Original language | English |
---|---|
Pages | 178 |
Number of pages | 185 |
DOIs | |
Status | Published - Feb 2024 |
Event | International Conference on Agents and Artificial Intelligence - Rome, Italy Duration: 24 Feb 2024 → 26 Feb 2024 https://icaart.scitevents.org/?y=2024 |
Conference
Conference | International Conference on Agents and Artificial Intelligence |
---|---|
Abbreviated title | ICAART 2024 |
Country/Territory | Italy |
City | Rome |
Period | 24/02/2024 → 26/02/2024 |
Internet address |