Deep Deterministic Policy Gradient Based on Double Network Prioritized Experience Replay
The traditional deep deterministic policy gradient (DDPG) algorithm has the disadvantages of slow convergence velocity and ease of falling into the local optimum. From these two perspectives, a DDPG algorithm based on the double network prioritized experience replay mechanism (DNPER-DDPG) is propose...
Main Authors: | , , , , |
---|---|
Format: | Article |
Language: | English |
Published: |
IEEE
2021-01-01
|
Series: | IEEE Access |
Subjects: | |
Online Access: | https://ieeexplore.ieee.org/document/9409070/ |