Unmanned Aerial Vehicle Pitch Control under Delay Using Deep Reinforcement Learning with Continuous Action in Wind Tunnel Test

Nonlinear flight controllers for fixed-wing unmanned aerial vehicles (UAVs) can potentially be developed using deep reinforcement learning. However, there is often a reality gap between the simulation models used to train these controllers and the real world. This study experimentally investigated the application of deep reinforcement learning to the pitch control of a UAV in wind tunnel tests, with a particular focus on investigating the effect of time delays on flight controller performance. Multiple neural networks were trained in simulation with different assumed time delays and then wind tunnel tested. The neural networks trained with shorter delays tended to be susceptible to delay in the real tests and produce fluctuating behaviour. The neural networks trained with longer delays behaved more conservatively and did not produce oscillations but suffered steady state errors under some conditions due to unmodeled frictional effects. These results highlight the importance of performing physical experiments to validate controller performance and how the training approach used with reinforcement learning needs to be robust to reality gaps between simulation and the real world.
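The abstract describes training controllers in simulation with different assumed actuator/sensor time delays. A common way to model such a delay during training is to buffer commanded actions and apply them to the plant only after a fixed number of control steps. The sketch below illustrates this idea with a gym-style `step`/`reset` wrapper; the interface and all names are illustrative assumptions, not the paper's actual implementation.

```python
from collections import deque


class DelayedActionWrapper:
    """Minimal sketch: queue actions so the plant receives the action
    commanded `delay_steps` control periods earlier, modelling a fixed
    transport delay during reinforcement learning training.
    Assumes a hypothetical gym-style environment with step()/reset()."""

    def __init__(self, env, delay_steps):
        self.env = env
        self.delay_steps = delay_steps
        self.queue = deque()

    def reset(self):
        # Pre-fill with neutral (zero) actions so the first
        # `delay_steps` plant inputs represent the delay interval.
        self.queue = deque([0.0] * self.delay_steps)
        return self.env.reset()

    def step(self, action):
        self.queue.append(action)
        delayed = self.queue.popleft()  # action issued delay_steps ago
        return self.env.step(delayed)
```

Training separate policies with different `delay_steps` values, then testing each on hardware, mirrors the comparison the study performs between short-delay and long-delay training assumptions.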


Bibliographic Details
Main Authors: Daichi Wada, Sergio A. Araujo-Estrada, Shane Windsor
Format: Article
Language: English
Published: MDPI AG, 2021-09-01
Series: Aerospace
Subjects: attitude control; deep reinforcement learning; fixed-wing aircraft; unmanned aerial vehicle; wind tunnel test
Online Access: https://www.mdpi.com/2226-4310/8/9/258
DOI: 10.3390/aerospace8090258
Author affiliations:
Daichi Wada: Aeronautical Technology Directorate, Japan Aerospace Exploration Agency, Tokyo 181-0015, Japan
Sergio A. Araujo-Estrada: Department of Aerospace Engineering, University of Bristol, Bristol BS8 1TR, UK
Shane Windsor: Department of Aerospace Engineering, University of Bristol, Bristol BS8 1TR, UK
ISSN: 2226-4310