This paper formulates a reinforcement learning (RL) based adaptive traffic signal control algorithm capable of representing a full eight-phase ring-barrier configuration. The algorithm is trained using a distributed asynchronous architecture and evaluated under varying traffic demand conditions, benchmarked against state-of-the-practice actuated signal control (ASC). Results show that the RL-based signal control significantly outperforms optimized ASC, reducing average delay by 11-32%, and that training on diverse origin-destination (O-D) patterns leads to robust performance under unseen demand scenarios.
RL-based traffic signal controllers trained on diverse traffic patterns are significantly more robust than controllers trained on a single pattern, and they outperform state-of-the-practice actuated signal control even under highly dissimilar, unseen demand scenarios.
Reinforcement learning (RL) has attracted increasing interest for adaptive traffic signal control due to its model-free ability to learn control policies directly from interaction with the traffic environment. However, several challenges remain before RL-based signal control can be considered ready for field deployment. Many existing studies rely on simplified signal timing structures, the robustness of trained models under varying traffic demand conditions remains insufficiently evaluated, and runtime efficiency continues to pose challenges when training RL algorithms in microscopic traffic simulation environments. This study formulates an RL-based signal control algorithm capable of representing a full eight-phase ring-barrier configuration consistent with field signal controllers. The algorithm is trained and evaluated under varying traffic demand conditions and benchmarked against state-of-the-practice actuated signal control (ASC). To assess robustness, experiments are conducted across multiple traffic volumes and origin-destination (O-D) demand patterns with varying levels of structural similarity. To improve training efficiency, a distributed asynchronous training architecture is implemented that enables parallel simulation across multiple computing nodes. Results from a case study intersection show that the proposed RL-based signal control significantly outperforms optimized ASC, reducing average delay by 11-32% across movements. A model trained on a single O-D pattern generalizes well to similar unseen demand patterns but degrades under substantially different demand conditions. In contrast, a model trained on diverse O-D patterns demonstrates strong robustness, consistently outperforming ASC even under highly dissimilar unseen demand scenarios.
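To make the eight-phase ring-barrier structure concrete, the following minimal sketch enumerates the phase pairs that may be served concurrently under the conventional NEMA dual-ring layout. The phase numbering, the names `RINGS`, `BARRIER_GROUPS`, `compatible`, and `valid_phase_pairs`, and the idea of using these pairs as a discrete action set are illustrative assumptions; the abstract does not state how the paper encodes phases or actions.

```python
from itertools import product

# Assumed conventional NEMA dual-ring numbering (not taken from the paper):
# ring 1 serves phases 1-4 in sequence, ring 2 serves phases 5-8.
RINGS = {1: (1, 2, 3, 4), 2: (5, 6, 7, 8)}

# The barrier separates the two street groups. Phases on opposite sides
# of the barrier conflict and can never be green at the same time.
BARRIER_GROUPS = ({1, 2, 5, 6}, {3, 4, 7, 8})


def barrier_group(phase: int) -> int:
    """Return the index of the barrier group that contains `phase`."""
    for index, group in enumerate(BARRIER_GROUPS):
        if phase in group:
            return index
    raise ValueError(f"unknown phase: {phase}")


def compatible(ring1_phase: int, ring2_phase: int) -> bool:
    """A ring-1 phase and a ring-2 phase may run together only when
    both lie on the same side of the barrier."""
    return barrier_group(ring1_phase) == barrier_group(ring2_phase)


def valid_phase_pairs() -> list[tuple[int, int]]:
    """Enumerate every concurrent (ring-1, ring-2) phase pair."""
    return [
        (p, q)
        for p, q in product(RINGS[1], RINGS[2])
        if compatible(p, q)
    ]


if __name__ == "__main__":
    for pair in valid_phase_pairs():
        print(pair)
```

Under these assumptions the enumeration yields eight concurrent pairs, four on each side of the barrier. A controller that respects the ring-barrier structure would restrict its choices to such pairs, which is one plausible way an RL agent's action space could be kept consistent with field signal controllers.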