Cooperative Deep Q-Learning With Q-Value Transfer for Multi-Intersection Signal Control

The problem of adaptive traffic signal control in multi-intersection systems has attracted the attention of researchers. Among existing methods, reinforcement learning has been shown to be effective. However, complex intersection features, heterogeneous intersection structures, and dynamic coordination among multiple intersections pose challenges for reinforcement learning-based algorithms.


Bibliographic Details
Main Authors: Hongwei Ge, Yumei Song, Chunguo Wu, Jiankang Ren, Guozhen Tan
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/8674720/
id doaj-f7901d809faf410db68761d302808a32
record_format Article
spelling doaj-f7901d809faf410db68761d302808a32
Updated: 2021-04-05T17:01:56Z
Language: eng
Publisher: IEEE
Series: IEEE Access, ISSN 2169-3536
Published: 2019-01-01, Vol. 7, pp. 40797-40809
DOI: 10.1109/ACCESS.2019.2907618
Article number: 8674720
Title: Cooperative Deep Q-Learning With Q-Value Transfer for Multi-Intersection Signal Control
Authors and affiliations:
Hongwei Ge (https://orcid.org/0000-0002-8937-1515), College of Computer Science and Technology, Dalian University of Technology, Dalian, China
Yumei Song, College of Computer Science and Technology, Dalian University of Technology, Dalian, China
Chunguo Wu, Key Laboratory of Symbolic Computation and Knowledge Engineering, Jilin University, Changchun, China
Jiankang Ren, College of Computer Science and Technology, Dalian University of Technology, Dalian, China
Guozhen Tan, College of Computer Science and Technology, Dalian University of Technology, Dalian, China
Abstract: The problem of adaptive traffic signal control in multi-intersection systems has attracted the attention of researchers. Among existing methods, reinforcement learning has been shown to be effective. However, complex intersection features, heterogeneous intersection structures, and dynamic coordination among multiple intersections pose challenges for reinforcement learning-based algorithms. This paper proposes a cooperative deep Q-network with Q-value transfer (QT-CDQN) for adaptive multi-intersection signal control. In QT-CDQN, a multi-intersection traffic network in a region is modeled as a multi-agent reinforcement learning system. Each agent searches for the optimal strategy to control an intersection using a deep Q-network that takes the discrete state encoding of traffic information as its input. To work cooperatively, each agent considers the influence of the latest actions of its adjacent intersections during policy learning. In particular, the optimal Q-values of the neighboring agents at the latest time step are transferred into the loss function of the Q-network. Moreover, a target network and experience replay are used to improve the stability of the algorithm. The advantages of QT-CDQN lie not only in its effectiveness and scalability for multi-intersection systems but also in its versatility in dealing with heterogeneous intersection structures. Experimental studies under different road structures show that QT-CDQN is competitive in terms of average queue length, average speed, and average waiting time compared with state-of-the-art algorithms. Furthermore, experiments with recurring and occasional congestion validate the adaptability of QT-CDQN to dynamic traffic environments.
Online Access: https://ieeexplore.ieee.org/document/8674720/
Keywords: Deep reinforcement learning; multi-intersection signal control; Q-learning; Q-value transfer; cooperative
collection DOAJ
language English
format Article
sources DOAJ
author Hongwei Ge
Yumei Song
Chunguo Wu
Jiankang Ren
Guozhen Tan
spellingShingle Hongwei Ge
Yumei Song
Chunguo Wu
Jiankang Ren
Guozhen Tan
Cooperative Deep Q-Learning With Q-Value Transfer for Multi-Intersection Signal Control
IEEE Access
Deep reinforcement learning
multi-intersection signal control
Q-learning
Q-value transfer
cooperative
author_facet Hongwei Ge
Yumei Song
Chunguo Wu
Jiankang Ren
Guozhen Tan
author_sort Hongwei Ge
title Cooperative Deep Q-Learning With Q-Value Transfer for Multi-Intersection Signal Control
title_short Cooperative Deep Q-Learning With Q-Value Transfer for Multi-Intersection Signal Control
title_full Cooperative Deep Q-Learning With Q-Value Transfer for Multi-Intersection Signal Control
title_fullStr Cooperative Deep Q-Learning With Q-Value Transfer for Multi-Intersection Signal Control
title_full_unstemmed Cooperative Deep Q-Learning With Q-Value Transfer for Multi-Intersection Signal Control
title_sort cooperative deep q-learning with q-value transfer for multi-intersection signal control
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2019-01-01
description The problem of adaptive traffic signal control in multi-intersection systems has attracted the attention of researchers. Among existing methods, reinforcement learning has been shown to be effective. However, complex intersection features, heterogeneous intersection structures, and dynamic coordination among multiple intersections pose challenges for reinforcement learning-based algorithms. This paper proposes a cooperative deep Q-network with Q-value transfer (QT-CDQN) for adaptive multi-intersection signal control. In QT-CDQN, a multi-intersection traffic network in a region is modeled as a multi-agent reinforcement learning system. Each agent searches for the optimal strategy to control an intersection using a deep Q-network that takes the discrete state encoding of traffic information as its input. To work cooperatively, each agent considers the influence of the latest actions of its adjacent intersections during policy learning. In particular, the optimal Q-values of the neighboring agents at the latest time step are transferred into the loss function of the Q-network. Moreover, a target network and experience replay are used to improve the stability of the algorithm. The advantages of QT-CDQN lie not only in its effectiveness and scalability for multi-intersection systems but also in its versatility in dealing with heterogeneous intersection structures. Experimental studies under different road structures show that QT-CDQN is competitive in terms of average queue length, average speed, and average waiting time compared with state-of-the-art algorithms. Furthermore, experiments with recurring and occasional congestion validate the adaptability of QT-CDQN to dynamic traffic environments.
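The description above says that each agent's TD loss is augmented with the neighboring agents' latest optimal Q-values, stabilized by a target network and experience replay. The abstract does not specify the exact functional form, so the sketch below is only an illustration of the general idea: a toy linear "Q-network" stands in for the deep network, and the neighbors' max Q-values enter the TD target as a weighted additive term. All names and constants (STATE_DIM, N_ACTIONS, GAMMA, BETA, the transfer form itself) are assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM = 8    # size of the discrete state encoding of one intersection (assumed)
N_ACTIONS = 4    # number of signal phases an agent can choose (assumed)
GAMMA = 0.9      # discount factor (assumed)
BETA = 0.1       # weight of the neighbor Q-value transfer term (illustrative)

def q_net(params, state):
    """Toy linear stand-in for the deep Q-network: Q(s, .) = W s + b."""
    W, b = params
    return W @ state + b

def new_params():
    """Random initial parameters for one toy Q-network."""
    return rng.normal(size=(N_ACTIONS, STATE_DIM)), np.zeros(N_ACTIONS)

def td_target(reward, next_state, target_params, neighbor_params, neighbor_states):
    """TD target with the neighbors' latest optimal Q-values folded in.

    The target network (not the online network) evaluates the agent's own
    bootstrap term; the neighbors contribute their current max Q-values as
    a weighted additive term -- one plausible reading of "Q-value transfer".
    """
    own = reward + GAMMA * q_net(target_params, next_state).max()
    transfer = BETA * np.mean(
        [q_net(p, s).max() for p, s in zip(neighbor_params, neighbor_states)]
    )
    return own + transfer

def loss(params, target, state, action):
    """Squared TD error for one transition sampled from the replay buffer."""
    return (q_net(params, state)[action] - target) ** 2

# One illustrative update step for a single intersection agent
# with two adjacent intersections.
agent, target_net = new_params(), new_params()
neighbors = [new_params() for _ in range(2)]
s, s_next = rng.normal(size=STATE_DIM), rng.normal(size=STATE_DIM)
neighbor_states = [rng.normal(size=STATE_DIM) for _ in neighbors]

y = td_target(reward=-3.0, next_state=s_next,
              target_params=target_net,
              neighbor_params=neighbors,
              neighbor_states=neighbor_states)
l = loss(agent, y, s, action=1)
```

In a full implementation, `l` would be minimized by gradient descent over minibatches drawn from the replay buffer, and `target_net` would be periodically synchronized with `agent`, per the standard DQN recipe the abstract references.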
topic Deep reinforcement learning
multi-intersection signal control
Q-learning
Q-value transfer
cooperative
url https://ieeexplore.ieee.org/document/8674720/
work_keys_str_mv AT hongweige cooperativedeepqlearningwithqvaluetransferformultiintersectionsignalcontrol
AT yumeisong cooperativedeepqlearningwithqvaluetransferformultiintersectionsignalcontrol
AT chunguowu cooperativedeepqlearningwithqvaluetransferformultiintersectionsignalcontrol
AT jiankangren cooperativedeepqlearningwithqvaluetransferformultiintersectionsignalcontrol
AT guozhentan cooperativedeepqlearningwithqvaluetransferformultiintersectionsignalcontrol
_version_ 1721540429759905792