Semi-conditional planners for efficient planning under uncertainty with macro-actions

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, 2010. === Cataloged from PDF version of thesis. === Includes bibliographical references (p. 163-168). === Planning in large, partially observable domains is challenging, especially when good performance re...

Bibliographic Details
Main Author:	He, Ruijie
Other Authors:	Nicholas Roy.
Format:	Others
Language:	English
Published:	Massachusetts Institute of Technology 2010
Subjects:	Aeronautics and Astronautics.
Online Access:	http://hdl.handle.net/1721.1/59660

id	ndltd-MIT-oai-dspace.mit.edu-1721.1-59660
record_format	oai_dc
spelling	ndltd-MIT-oai-dspace.mit.edu-1721.1-59660 2019-05-02T16:21:49Z Semi-conditional planners for efficient planning under uncertainty with macro-actions Efficient planning under uncertainty with macro-actions He, Ruijie Nicholas Roy. Massachusetts Institute of Technology. Dept. of Aeronautics and Astronautics. Massachusetts Institute of Technology. Dept. of Aeronautics and Astronautics. Aeronautics and Astronautics. Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, 2010. Cataloged from PDF version of thesis. Includes bibliographical references (p. 163-168). by Ruijie He. Ph.D. 2010-10-29T18:04:56Z 2010-10-29T18:04:56Z 2010 2010 Thesis http://hdl.handle.net/1721.1/59660 668106876 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582 168 p. application/pdf Massachusetts Institute of Technology
collection	NDLTD
language	English
format	Others
sources	NDLTD
topic	Aeronautics and Astronautics.
spellingShingle	Aeronautics and Astronautics. He, Ruijie Semi-conditional planners for efficient planning under uncertainty with macro-actions
description	Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, 2010. === Cataloged from PDF version of thesis. === Includes bibliographical references (p. 163-168). === Planning in large, partially observable domains is challenging, especially when good performance requires considering situations far in the future. Existing planners typically construct a policy by performing fully conditional planning, where each future action is conditioned on a set of possible observations that could be obtained at every timestep. Unfortunately, fully conditional planning can be computationally expensive, and state-of-the-art solvers are either limited in the size of problems they can solve or can only plan out to a limited horizon. We propose that for a large class of real-world planning-under-uncertainty problems, it is necessary to perform far-lookahead decision-making, but unnecessary to construct policies that condition all actions on observations obtained at the previous timestep. Instead, these problems can be solved by performing semi-conditional planning, where the constructed policy conditions actions on observations only at certain key points. Between these key points, the policy assumes that a macro-action is executed: a temporally extended, fixed-length, open-loop sequence of primitive actions. These macro-actions are evaluated within a forward-search framework, which considers only beliefs that are reachable from the agent's current belief under different actions and observations; a belief summarizes an agent's past history of actions and observations. Together, semi-conditional planning and forward search restrict the policy space in exchange for conditional planning out to a longer horizon. 
Two technical challenges have to be overcome in order to perform semi-conditional planning efficiently: how to generate the macro-actions automatically, and how to incorporate them efficiently into the forward-search framework. We propose an algorithm that automatically constructs the macro-actions evaluated within a forward-search planning framework, iteratively refining them as more computation time is made available for planning. In addition, we show that for a subset of problem domains, it is possible to analytically compute the distribution over posterior beliefs that results from a single macro-action. This ability to directly compute a distribution over posterior beliefs yields computational savings when performing macro-action forward search. Performance and computational analyses of the algorithms proposed in this thesis are presented, as well as simulation experiments that demonstrate superior performance relative to existing state-of-the-art solvers on large planning-under-uncertainty domains. We also demonstrate our planning-under-uncertainty algorithms on target-tracking applications for an actual autonomous helicopter, highlighting the practical potential for planning in real-world, long-horizon, partially observable domains. === by Ruijie He. === Ph.D.
author2	Nicholas Roy.
author_facet	Nicholas Roy. He, Ruijie
author	He, Ruijie
author_sort	He, Ruijie
title	Semi-conditional planners for efficient planning under uncertainty with macro-actions
title_sort	semi-conditional planners for efficient planning under uncertainty with macro-actions
publisher	Massachusetts Institute of Technology
publishDate	2010
url	http://hdl.handle.net/1721.1/59660
work_keys_str_mv	AT heruijie semiconditionalplannersforefficientplanningunderuncertaintywithmacroactions AT heruijie efficientplanningunderuncertaintywithmacroactions
_version_	1719039473470144512
