TY - JOUR
T1 - Continuous residual reinforcement learning for traffic signal control optimization
AU - Aslani, Mohammad
AU - Seipel, Stefan
AU - Wiering, Marco
PY - 2018/8
Y1 - 2018/8
AB - Traffic signal control can be naturally regarded as a reinforcement learning problem. Unfortunately, it is one of the most difficult classes of reinforcement learning problems owing to its large state space. A straightforward approach to address this challenge is to control traffic signals based on continuous reinforcement learning. Although they have been successful in traffic signal control, they may become unstable and fail to converge to near-optimal solutions. We develop adaptive traffic signal controllers based on continuous residual reinforcement learning (CRL-TSC) that is more stable. The effect of three feature functions is empirically investigated in a microscopic traffic simulation. Furthermore, the effects of departing streets, more actions, and the use of the spatial distribution of the vehicles on the performance of CRL-TSCs are assessed. The results show that the best setup of the CRL-TSC leads to saving average travel time by 15% in comparison to an optimized fixed-time controller.
KW - Traffic Control
KW - Reinforcement Learning
UR - http://www.mendeley.com/research/continuous-residual-reinforcement-learning-traffic-signal-control-optimization
U2 - 10.1139/cjce-2017-0408
DO - 10.1139/cjce-2017-0408
M3 - Article
SN - 0315-1468
VL - 45
SP - 690
EP - 702
JO - Canadian Journal of Civil Engineering
JF - Canadian Journal of Civil Engineering
IS - 8
M1 - cjce-2017-0408
ER -