Decoupling State Representation Methods from Reinforcement Learning in Car Racing

  • Juan M. Montoya
  • Imant Daunhawer
  • Julia E. Vogt
  • Marco Wiering

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

2 Citations (Scopus)
174 Downloads (Pure)

Abstract

In the quest for efficient and robust learning methods, combining unsupervised state representation learning with reinforcement learning (RL) could help scale RL algorithms by providing the models with a useful inductive bias. To achieve this, an encoder is trained in an unsupervised manner with two state representation methods: a variational autoencoder and a contrastive estimator. The learned features are then fed to the actor-critic RL algorithm Proximal Policy Optimization (PPO) to learn a policy for playing OpenAI's car racing environment. This procedure decouples state representations from RL controllers. To integrate RL with unsupervised learning, we explore various designs for variational autoencoders and contrastive learning. The proposed method is compared to a deep network trained directly on pixel inputs with PPO. The results show that the proposed method performs slightly worse than learning directly from pixel inputs; however, it has a more stable learning curve, substantially reduces the buffer size, and requires optimizing 88% fewer parameters. These results indicate that using pre-trained state representations has several benefits for solving RL tasks.
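The decoupling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the single linear layers, the sizes `N_PIXELS`, `N_FEATURES`, and `N_ACTIONS`, and the class names `FrozenEncoder` and `PolicyHead` are all hypothetical, and the encoder weights are random stand-ins for weights that would come from unsupervised pre-training. The sketch only shows the structure, in which a frozen encoder maps pixels to features and only the small controller on top of it would be optimized by RL.

```python
import random


def make_linear(n_in, n_out, rng):
    """Return (weights, bias) for a dense layer with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def linear(layer, x):
    """Apply a dense layer to the input vector x."""
    weights, bias = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def count_params(layer):
    """Count the scalar parameters (weights plus biases) of a dense layer."""
    weights, bias = layer
    return sum(len(row) for row in weights) + len(bias)


class FrozenEncoder:
    """Hypothetical stand-in for a pre-trained encoder; never updated during RL."""

    def __init__(self, n_pixels, n_features, rng):
        self.layer = make_linear(n_pixels, n_features, rng)

    def encode(self, pixels):
        # ReLU over a linear projection: pixels -> compact feature vector.
        return [max(0.0, v) for v in linear(self.layer, pixels)]


class PolicyHead:
    """Hypothetical controller; the only part an RL algorithm would optimize."""

    def __init__(self, n_features, n_actions, rng):
        self.layer = make_linear(n_features, n_actions, rng)

    def action_scores(self, features):
        return linear(self.layer, features)


rng = random.Random(0)
N_PIXELS, N_FEATURES, N_ACTIONS = 64, 8, 3  # illustrative sizes, not from the paper

encoder = FrozenEncoder(N_PIXELS, N_FEATURES, rng)
head = PolicyHead(N_FEATURES, N_ACTIONS, rng)

frame = [rng.random() for _ in range(N_PIXELS)]  # fake flattened image
features = encoder.encode(frame)                 # the representation is fixed
scores = head.action_scores(features)            # only this mapping is learned

trainable = count_params(head.layer)
total = trainable + count_params(encoder.layer)
reduction = 1 - trainable / total  # share of parameters the RL step leaves untouched
print(f"trainable: {trainable} of {total} parameters")
```

With these toy sizes most parameters sit in the frozen encoder, which is the mechanism behind a smaller optimization problem for the controller. The ratio here depends entirely on the made-up layer sizes and should not be read as the 88% figure reported in the abstract.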

Original language: English
Title of host publication: ICAART 2021 - Proceedings of the 13th International Conference on Agents and Artificial Intelligence
Editors: Ana Paula Rocha, Luc Steels, Jaap van den Herik
Publisher: SciTePress
Pages: 752-759
Number of pages: 8
ISBN (Electronic): 9789897584848
DOIs
Publication status: Published - 2021
Event: 13th International Conference on Agents and Artificial Intelligence, ICAART 2021 - Virtual, Online
Duration: 4-Feb-2021 to 6-Feb-2021

Publication series

Name: ICAART 2021 - Proceedings of the 13th International Conference on Agents and Artificial Intelligence
Volume: 2

Conference

Conference: 13th International Conference on Agents and Artificial Intelligence, ICAART 2021
City: Virtual, Online
Period: 04/02/2021 to 06/02/2021

Keywords

  • Contrastive learning
  • Deep reinforcement learning
  • State representation learning
  • Variational autoencoders
