Deep Reinforcement Learning for Target Searching in Cognitive Electronic Warfare

The recent prominence of deep reinforcement learning (DRL) stems from its successes in many domains, yet its applications in practical engineering, including the optimization of control strategies in cognitive electronic warfare (CEW), remain unsatisfactory. CEW is a large and challenging undertaking, and because its data sources are sensitive, few open studies have investigated it. Moreover, the spatial sparsity, continuous action spaces, and partially observable environments inherent in CEW severely limit DRL algorithms that depend strongly on state-value and action-value functions. In this paper, we use Python to build a 3-D space game named Explorer that simulates various CEW environments in which the electronic attacker is an unmanned combat air vehicle (UCAV) and the defender is an observation station, both equipped with radar as the observation sensor. In the game, the UCAV must detect the target as early as possible so that follow-up tracking and guidance tasks can be performed. To teach an "infant" UCAV what "target searching" means, we train its maneuvering strategies with carefully designed reward shaping, a simplified constant-acceleration motion control, and a deep deterministic policy gradient (DDPG) algorithm based on a generative model and variational Bayesian estimation. The experimental results show that, with an operating cycle of 0.2 s, the search success rate of the trained UCAV over 10,000 episodes improves by 33.36% compared with the benchmark, and the target destruction rate improves by 57.84%.
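The record contains no code, but the abstract's core technique, training a continuous-action search agent with DDPG, can be illustrated with a minimal sketch. The PyTorch code below is an assumed reconstruction, not the authors' implementation: the state and action dimensions, network sizes, and hyperparameters are placeholders, and the paper's reward shaping, generative model, and variational Bayesian estimation are omitted.

```python
# Minimal DDPG sketch for a continuous-action target-search agent.
# Illustrative only: dimensions and hyperparameters are assumptions, and the
# paper's generative model / variational Bayesian components are not included.
import torch
import torch.nn as nn

STATE_DIM = 9    # assumed: UCAV position/velocity plus a partial radar observation
ACTION_DIM = 3   # assumed: 3-D acceleration command (constant-acceleration control)
GAMMA, TAU = 0.99, 0.005

class Actor(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 128), nn.ReLU(),
            nn.Linear(128, ACTION_DIM), nn.Tanh())  # bounded acceleration in [-1, 1]

    def forward(self, s):
        return self.net(s)

class Critic(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, 128), nn.ReLU(),
            nn.Linear(128, 1))

    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

actor, critic = Actor(), Critic()
actor_t, critic_t = Actor(), Critic()           # target networks
actor_t.load_state_dict(actor.state_dict())
critic_t.load_state_dict(critic.state_dict())
opt_a = torch.optim.Adam(actor.parameters(), lr=1e-4)
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-3)

def ddpg_update(s, a, r, s2, done):
    """One DDPG step on a batch of transitions (s, a, r, s2, done)."""
    with torch.no_grad():
        q_target = r + GAMMA * (1 - done) * critic_t(s2, actor_t(s2))
    critic_loss = nn.functional.mse_loss(critic(s, a), q_target)
    opt_c.zero_grad(); critic_loss.backward(); opt_c.step()

    actor_loss = -critic(s, actor(s)).mean()    # deterministic policy gradient
    opt_a.zero_grad(); actor_loss.backward(); opt_a.step()

    # Polyak averaging of the target networks
    for p, pt in zip(actor.parameters(), actor_t.parameters()):
        pt.data.mul_(1 - TAU).add_(TAU * p.data)
    for p, pt in zip(critic.parameters(), critic_t.parameters()):
        pt.data.mul_(1 - TAU).add_(TAU * p.data)

if __name__ == "__main__":
    # Smoke test with a random batch of 32 transitions.
    s = torch.randn(32, STATE_DIM); a = torch.rand(32, ACTION_DIM) * 2 - 1
    r = torch.randn(32, 1); s2 = torch.randn(32, STATE_DIM)
    done = torch.zeros(32, 1)
    ddpg_update(s, a, r, s2, done)
```

In a full setup, a training loop would interleave rollouts in an Explorer-like environment with calls to ddpg_update on minibatches sampled from a replay buffer; the shaped reward and the variational Bayesian state estimation described in the abstract would live in the environment and the observation pipeline, respectively.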


Bibliographic Details
Main Authors: Shixun You, Ming Diao, Lipeng Gao (College of Information and Communication, Harbin Engineering University, Harbin, China; Lipeng Gao: https://orcid.org/0000-0002-2287-6279)
Format: Article
Language: English
Published: IEEE, 2019-01-01
Series: IEEE Access, vol. 7, pp. 37432-37447
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2019.2905649
Source: DOAJ
Subjects: Deep reinforcement learning; Cognitive electronic warfare; Motion planning; Deep deterministic policy gradient; Variational Bayesian estimation; Target searching
Online Access: https://ieeexplore.ieee.org/document/8668391/