Completing Explorer Games with a Deep Reinforcement Learning Framework Based on Behavior Angle Navigation

In cognitive electronic warfare, when a typical combat vehicle, such as an unmanned combat air vehicle (UCAV), uses radar sensors to explore an unknown space, the target-searching fails due to an inefficient servoing/tracking system. Thus, to solve this problem, we developed an autonomous reasoning...

Full description

Bibliographic Details
Main Authors: Shixun You, Ming Diao, Lipeng Gao
Format: Article
Language:English
Published: MDPI AG 2019-05-01
Series:Electronics
Subjects:
Online Access:https://www.mdpi.com/2079-9292/8/5/576
Description
Summary:In cognitive electronic warfare, when a typical combat vehicle, such as an unmanned combat air vehicle (UCAV), uses radar sensors to explore an unknown space, the target-searching fails due to an inefficient servoing/tracking system. Thus, to solve this problem, we developed an autonomous reasoning search method that can generate efficient decision-making actions and guide the UCAV as early as possible to the target area. For high-dimensional continuous action space, the UCAV’s maneuvering strategies are subject to certain physical constraints. We first record the path histories of the UCAV as a sample set of supervised experiments and then construct a grid cell network using long short-term memory (LSTM) to generate a new displacement prediction to replace the target location estimation. Finally, we enable a variety of continuous-control-based deep reinforcement learning algorithms to output optimal/sub-optimal decision-making actions. All these tasks are performed in a three-dimensional target-searching simulator, i.e., the Explorer game. Please note that we use the behavior angle (BHA) for the first time as the main factor of the reward-shaping of the deep reinforcement learning framework and successfully make the trained UCAV achieve a 99.96% target destruction rate, i.e., the game win rate, in a 0.1 s operating cycle.
ISSN:2079-9292