Adaptive Urban Traffic Signal Optimization Using Reinforcement Learning: A D-DQN Approach

  • Unique Paper ID: 199867
  • Volume: 12
  • Issue: 11
  • PageNo: 14500-14513
  • Abstract: Urban traffic congestion, driven by increasing vehicle density and rigid fixed-time signal control systems, demands adaptive solutions. This paper presents a working adaptive traffic signal control system that employs a Double Deep Q Network (D-DQN) reinforcement learning agent to dynamically optimize signal phase selection. The D-DQN architecture decouples action selection from action evaluation using separate policy and target networks, which mitigates the Q-value overestimation and instability inherent in standard DQN. Built on the SUMO simulation platform, the system introduces a dual parallel simulation methodology in which RL-controlled and fixed-time baseline simulations execute simultaneously on identical traffic scenarios, enabling unbiased real-time comparison. The agent operates on a 25-dimensional state vector encompassing local intersection metrics, neighbor intersection conditions and global network statistics. It selects among four discrete phase actions subject to safety constraints, including minimum phase hold, maximum phase duration and starvation prevention. Two topologies are evaluated: a single four-way intersection and a 3×3 urban arterial grid with five signalized intersections. On the single intersection, D-DQN reduces average waiting time by 58.3%, improves speed by 43.8% and decreases queue length by 46.3%. On the urban arterial, the agent achieves a 12.3% waiting time reduction, a 4.9% speed improvement and a 7.1% queue reduction. The system incorporates emergency vehicle detection with signal pre-emption and a real-time React-based dashboard for live performance visualization. Results demonstrate that D-DQN-based control improves flow quality on both network topologies, substantially on the single intersection and more modestly on the arterial grid. A tradeoff between throughput and flow quality is observed on both topologies, consistent with phase-holding optimization strategies documented in prior D-DQN studies.
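
The decoupling of action selection from action evaluation described in the abstract can be illustrated with a minimal sketch of the generic Double DQN target computation. This is not the authors' implementation: the function names, the discount factor `gamma`, and the Q-values for the four phase actions are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def argmax(values: Sequence[float]) -> int:
    """Return the index of the largest value (first index on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def double_dqn_target(reward: float,
                      next_q_policy: Sequence[float],
                      next_q_target: Sequence[float],
                      gamma: float = 0.99,
                      done: bool = False) -> float:
    """Double DQN target: the policy network picks the next action,
    the target network scores that action."""
    if done:
        return reward
    best_action = argmax(next_q_policy)                 # selection (policy net)
    return reward + gamma * next_q_target[best_action]  # evaluation (target net)


def dqn_target(reward: float,
               next_q_target: Sequence[float],
               gamma: float = 0.99,
               done: bool = False) -> float:
    """Standard DQN target: one network both selects and scores the action,
    which tends to overestimate Q-values."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


# Hypothetical Q-values for four phase actions in the next state.
next_q_policy = [1.0, 2.5, 0.3, 1.8]
next_q_target = [1.2, 2.0, 0.1, 3.0]
reward = -1.0  # illustrative penalty, e.g. for accumulated waiting time

ddqn = double_dqn_target(reward, next_q_policy, next_q_target, gamma=0.9)
dqn = dqn_target(reward, next_q_target, gamma=0.9)
print(f"Double DQN target: {ddqn:.2f}, standard DQN target: {dqn:.2f}")
```

With these made-up numbers the policy network prefers action 1, so the Double DQN target is about 0.8, while the standard DQN target takes the maximum of the target network's estimates and is about 1.7. The gap shows how a single noisy high estimate inflates the standard target.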

Copyright & License

Copyright © 2026. The authors retain the copyright of this article. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

BibTeX

@article{199867,
        author = {Puja R. Patil and Sumedh Gaikwad and Nimish Darne and Yash Joshi},
        title = {Adaptive Urban Traffic Signal Optimization Using Reinforcement Learning: A D-DQN Approach},
        journal = {International Journal of Innovative Research in Technology},
        year = {2026},
        volume = {12},
        number = {11},
        pages = {14500--14513},
        issn = {2349-6002},
        url = {https://ijirt.org/article?manuscript=199867},
        abstract = {Urban traffic congestion, driven by increasing vehicle density and rigid fixed-time signal control systems, demands adaptive solutions. This paper presents a working adaptive traffic signal control system that employs a Double Deep Q Network (D-DQN) reinforcement learning agent to dynamically optimize signal phase selection. The D-DQN architecture decouples action selection from action evaluation using separate policy and target networks, which mitigates the Q-value overestimation and instability inherent in standard DQN. Built on the SUMO simulation platform, the system introduces a dual parallel simulation methodology in which RL-controlled and fixed-time baseline simulations execute simultaneously on identical traffic scenarios, enabling unbiased real-time comparison. The agent operates on a 25-dimensional state vector encompassing local intersection metrics, neighbor intersection conditions and global network statistics. It selects among four discrete phase actions subject to safety constraints, including minimum phase hold, maximum phase duration and starvation prevention. Two topologies are evaluated: a single four-way intersection and a 3×3 urban arterial grid with five signalized intersections. On the single intersection, D-DQN reduces average waiting time by 58.3\%, improves speed by 43.8\% and decreases queue length by 46.3\%. On the urban arterial, the agent achieves a 12.3\% waiting time reduction, a 4.9\% speed improvement and a 7.1\% queue reduction. The system incorporates emergency vehicle detection with signal pre-emption and a real-time React-based dashboard for live performance visualization. Results demonstrate that D-DQN-based control improves flow quality on both network topologies, substantially on the single intersection and more modestly on the arterial grid. A tradeoff between throughput and flow quality is observed on both topologies, consistent with phase-holding optimization strategies documented in prior D-DQN studies.},
        keywords = {Double Deep Q Network (D-DQN), Reinforcement Learning, Traffic Signal Control, SUMO, Adaptive Signal Optimization, Dual Simulation Comparison, Emergency Vehicle Pre-emption, Intelligent Transportation Systems},
        month = {April},
        }

Cite This Article

Patil, P. R., Gaikwad, S., Darne, N., & Joshi, Y. (2026). Adaptive Urban Traffic Signal Optimization Using Reinforcement Learning: A D-DQN Approach. International Journal of Innovative Research in Technology (IJIRT), 12(11), 14500–14513.
