Cooperative Multiagent Deep Deterministic Policy Gradient (CoMADDPG) for Intelligent Connected Transportation with Unsignalized Intersection

Unsignalized intersection control is one of the most critical issues in intelligent transportation systems, which requires connected and automated vehicles to support more frequent information interaction and on-board computing. It is very promising to introduce reinforcement learning in the unsigna...

Full description

Bibliographic Details
Main Authors: Tianhao Wu, Mingzhi Jiang, Lin Zhang
Format: Article
Language:English
Published: Hindawi Limited 2020-01-01
Series:Mathematical Problems in Engineering
Online Access:http://dx.doi.org/10.1155/2020/1820527
Description
Summary:Unsignalized intersection control is one of the most critical issues in intelligent transportation systems, which requires connected and automated vehicles to support more frequent information interaction and on-board computing. It is very promising to introduce reinforcement learning in the unsignalized intersection control. However, the existing multiagent reinforcement learning algorithms, such as multiagent deep deterministic policy gradient (MADDPG), hardly handle a dynamic number of vehicles, which cannot meet the need of the real road condition. Thus, this paper proposes a Cooperative MADDPG (CoMADDPG) for connected vehicles at unsignalized intersection to solve this problem. Firstly, the scenario of multiple vehicles passing through an unsignalized intersection is formulated as a multiagent reinforcement learning (RL) problem. Secondly, MADDPG is redefined to adapt to the dynamic quantity agents, where each vehicle selects reference vehicles to construct a partial stationary environment, which is necessary for RL. Thirdly, this paper incorporates a novel vehicle selection method, which projects the reference vehicles on a virtual lane and selects the largest impact vehicles to construct the environment. At last, an intersection simulation platform is developed to evaluate the proposed method. According to the simulation result, CoMADDPG can reduce average travel time by 39.28% compared with the other optimization-based methods, which indicates that CoMADDPG has an excellent prospect in dealing with the scenario of unsignalized intersection control.
ISSN:1024-123X
1563-5147