Active Learning in Persistent Surveillance UAV Missions


Bibliographic Details
Main Authors: Redding, Joshua (Contributor), Bethke, Brett M. (Contributor), Bertuccelli, Luca F. (Contributor), How, Jonathan P. (Contributor)
Other Authors: Massachusetts Institute of Technology. Aerospace Controls Laboratory (Contributor), Massachusetts Institute of Technology. Department of Aeronautics and Astronautics (Contributor)
Format: Article
Language: English
Published: American Institute of Aeronautics and Astronautics, 2013-10-23T15:08:12Z.
Subjects:
Online Access: Get fulltext
LEADER 02381 am a22002533u 4500
001 81479
042 |a dc 
100 1 0 |a Redding, Joshua  |e author 
100 1 0 |a Massachusetts Institute of Technology. Aerospace Controls Laboratory  |e contributor 
100 1 0 |a Massachusetts Institute of Technology. Department of Aeronautics and Astronautics  |e contributor 
100 1 0 |a How, Jonathan P.  |e contributor 
100 1 0 |a Redding, Joshua  |e contributor 
100 1 0 |a Bethke, Brett M.  |e contributor 
100 1 0 |a Bertuccelli, Luca F.  |e contributor 
700 1 0 |a Bethke, Brett M.  |e author 
700 1 0 |a Bertuccelli, Luca F.  |e author 
700 1 0 |a How, Jonathan P.  |e author 
245 0 0 |a Active Learning in Persistent Surveillance UAV Missions 
260 |b American Institute of Aeronautics and Astronautics,   |c 2013-10-23T15:08:12Z. 
856 |z Get fulltext  |u http://hdl.handle.net/1721.1/81479 
520 |a The performance of many complex UAV decision-making problems can be extremely sensitive to small errors in the model parameters. One way of mitigating this sensitivity is by designing algorithms that more effectively learn the model throughout the course of a mission. This paper addresses this important problem by considering model uncertainty in a multi-agent Markov Decision Process (MDP) and using an active learning approach to quickly learn transition model parameters. We build on previous research that allowed UAVs to passively update model parameter estimates by incorporating new state transition observations. In this work, however, the UAVs choose to actively reduce the uncertainty in their model parameters by taking exploratory and informative actions. These actions result in faster adaptation and, by explicitly accounting for UAV fuel dynamics, also mitigate the risk of exploration. This paper compares the nominal, passive learning approach against two methods for incorporating active learning into the MDP framework: (1) all state transitions are rewarded equally, and (2) state transition rewards are weighted according to the expected resulting reduction in the variance of the model parameter. In both cases, agent behaviors emerge that enable faster convergence of the uncertain model parameters to their true values. 
546 |a en_US 
655 7 |a Article 
773 |t Proceedings of the AIAA Infotech@Aerospace Conference and AIAA Unmanned...Unlimited Conference
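The second active-learning variant in the abstract weights state-transition rewards by the expected reduction in the variance of the model parameter. The record does not give the paper's estimator, so as a minimal sketch, assume each uncertain transition probability is tracked with a Beta-Bernoulli posterior; the reward weight for visiting that transition is then the expected drop in posterior variance from one more observation:

```python
class TransitionEstimate:
    """Beta posterior over a single Bernoulli transition probability.

    NOTE: the Beta-Bernoulli model and this class are illustrative
    assumptions, not the paper's actual estimator.
    """

    def __init__(self, alpha=1.0, beta=1.0):
        self.alpha = alpha  # pseudo-count of observed transitions that occurred
        self.beta = beta    # pseudo-count of transitions that did not occur

    def mean(self):
        return self.alpha / (self.alpha + self.beta)

    def variance(self):
        a, b = self.alpha, self.beta
        return a * b / ((a + b) ** 2 * (a + b + 1.0))

    def expected_variance_reduction(self):
        """Current variance minus the expected posterior variance after one
        more observation, averaged over the posterior-predictive outcome.
        This quantity could serve as the exploration reward weight."""
        p = self.mean()
        var_if_success = TransitionEstimate(self.alpha + 1, self.beta).variance()
        var_if_failure = TransitionEstimate(self.alpha, self.beta + 1).variance()
        return self.variance() - (p * var_if_success + (1 - p) * var_if_failure)

    def update(self, observed):
        """Incorporate one state-transition observation."""
        if observed:
            self.alpha += 1
        else:
            self.beta += 1
```

Because the expected reduction shrinks as observations accumulate, weighting rewards this way naturally steers agents toward transitions whose parameters are still uncertain, which is consistent with the faster-convergence behavior the abstract reports.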