Deep Reinforcement Learning for Target Searching in Cognitive Electronic Warfare
The recent appreciation of deep reinforcement learning (DRL) arises from its successes in many domains, but the applications of DRL in practical engineering are still unsatisfactory, including optimizing control strategies in cognitive electronic warfare (CEW). CEW is a massive and challenging proje...
Main Authors: | Shixun You, Ming Diao, Lipeng Gao |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2019-01-01 |
Series: | IEEE Access |
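Subjects: | Deep reinforcement learning; cognitive electronic warfare; motion planning; deep deterministic policy gradient; variational Bayesian estimation; target searching |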
Online Access: | https://ieeexplore.ieee.org/document/8668391/ |
id |
doaj-9337334707664439b09eea0e7bd8b7e6 |
---|---|
record_format |
Article |
spelling |
doaj-9337334707664439b09eea0e7bd8b7e6; 2021-03-29T22:13:13Z; eng; IEEE; IEEE Access; 2169-3536; 2019-01-01; 7; 37432; 37447; 10.1109/ACCESS.2019.2905649; 8668391; Deep Reinforcement Learning for Target Searching in Cognitive Electronic Warfare; Shixun You (0); Ming Diao (1); Lipeng Gao (2); https://orcid.org/0000-0002-2287-6279; College of Information and Communication, Harbin Engineering University, Harbin, China; College of Information and Communication, Harbin Engineering University, Harbin, China; College of Information and Communication, Harbin Engineering University, Harbin, China; The recent appreciation of deep reinforcement learning (DRL) arises from its successes in many domains, but the applications of DRL in practical engineering are still unsatisfactory, including optimizing control strategies in cognitive electronic warfare (CEW). CEW is a massive and challenging project, and due to the sensitivity of the data sources, there are few open studies that have investigated CEW. Moreover, the spatial sparsity, continuous action, and partially observable environment that exist in CEW have greatly limited the abilities of DRL algorithms, which strongly depend on state-value and action-value functions. In this paper, we use Python to build a 3-D space game named Explorer to simulate various CEW environments in which the electronic attacker is an unmanned combat air vehicle (UCAV) and the defender is an observation station, both of which are equipped with radar as the observation sensor. In our game, the UCAV needs to accomplish the task of detecting the target as early as possible to perform follow-up tracking and guidance tasks. To allow an "infant" UCAV to understand what "target searching" is, we train the UCAV's maneuvering strategies by means of a well-designed reward shaping, a simplified constant accelerated motion control, and a deep deterministic policy gradient (DDPG) algorithm based on a generative model and variational Bayesian estimation. The experimental results show that when the operating cycle is 0.2 s, the search success rate of the trained UCAV in 10000 episodes is improved by 33.36% compared with the benchmark, and the target destruction rate is similarly improved by 57.84%.; https://ieeexplore.ieee.org/document/8668391/; Deep reinforcement learning; cognitive electronic warfare; motion planning; deep deterministic policy gradient; variational Bayesian estimation; target searching |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Shixun You Ming Diao Lipeng Gao |
spellingShingle |
Shixun You Ming Diao Lipeng Gao Deep Reinforcement Learning for Target Searching in Cognitive Electronic Warfare IEEE Access Deep reinforcement learning cognitive electronic warfare motion planning deep deterministic policy gradient variational Bayesian estimation target searching |
author_facet |
Shixun You Ming Diao Lipeng Gao |
author_sort |
Shixun You |
title |
Deep Reinforcement Learning for Target Searching in Cognitive Electronic Warfare |
title_sort |
deep reinforcement learning for target searching in cognitive electronic warfare |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2019-01-01 |
description |
The recent appreciation of deep reinforcement learning (DRL) arises from its successes in many domains, but the applications of DRL in practical engineering are still unsatisfactory, including optimizing control strategies in cognitive electronic warfare (CEW). CEW is a massive and challenging project, and due to the sensitivity of the data sources, there are few open studies that have investigated CEW. Moreover, the spatial sparsity, continuous action, and partially observable environment that exist in CEW have greatly limited the abilities of DRL algorithms, which strongly depend on state-value and action-value functions. In this paper, we use Python to build a 3-D space game named Explorer to simulate various CEW environments in which the electronic attacker is an unmanned combat air vehicle (UCAV) and the defender is an observation station, both of which are equipped with radar as the observation sensor. In our game, the UCAV needs to accomplish the task of detecting the target as early as possible to perform follow-up tracking and guidance tasks. To allow an "infant" UCAV to understand what "target searching" is, we train the UCAV's maneuvering strategies by means of a well-designed reward shaping, a simplified constant accelerated motion control, and a deep deterministic policy gradient (DDPG) algorithm based on a generative model and variational Bayesian estimation. The experimental results show that when the operating cycle is 0.2 s, the search success rate of the trained UCAV in 10000 episodes is improved by 33.36% compared with the benchmark, and the target destruction rate is similarly improved by 57.84%. |
topic |
Deep reinforcement learning; cognitive electronic warfare; motion planning; deep deterministic policy gradient; variational Bayesian estimation; target searching |
url |
https://ieeexplore.ieee.org/document/8668391/ |