Exploring partially observable Markov decision processes by exploiting structure and heuristic information
This thesis is about chance and choice, or decisions under uncertainty. The desire to create an intelligent agent that performs rewarding tasks in a realistic world calls for working models of sequential decision making and planning. In response to this grand wish, decision-theoretic plannin...
Main Author: | Leung, Siu-Ki |
---|---|
Format: | Others |
Language: | English |
Published: | 2009 |
Online Access: | http://hdl.handle.net/2429/5772 |
id |
ndltd-UBC-oai-circle.library.ubc.ca-2429-5772 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-UBC-oai-circle.library.ubc.ca-2429-57722018-01-05T17:32:42Z Exploring partially observable Markov decision processes by exploiting structure and heuristic information Leung, Siu-Ki This thesis is about chance and choice, or decisions under uncertainty. The desire to create an intelligent agent that performs rewarding tasks in a realistic world calls for working models of sequential decision making and planning. In response to this grand wish, decision-theoretic planning (DTP) has evolved from decision theory and control theory, and has been applied to planning in artificial intelligence. Recent interest has been directed toward Markov Decision Processes (MDPs), introduced from operations research. While fruitful results have been obtained for fully observable MDPs, partially observable MDPs (POMDPs) remain too difficult to solve once observation uncertainties are incorporated. Abstraction and approximation techniques have therefore become the focus. This research attempts to enhance POMDPs by applying AI techniques. In particular, we transform the linear POMDP constructs into a structured representation based on binary decision trees and Bayesian networks to achieve compactness. A handful of tree-oriented operations is then developed to perform structural belief updates and value computation. Along with the structured representation, we explore the belief space with a heuristic online search approach, in which a best-first search strategy with heuristic pruning is employed. Experiments with a structured testbed domain reveal the great potential of exploiting structure and heuristics to empower POMDPs for more practical applications. Science, Faculty of Computer Science, Department of Graduate 2009-03-09T19:51:22Z 2009-03-09T19:51:22Z 1996 1997-05 Text Thesis/Dissertation http://hdl.handle.net/2429/5772 eng For non-commercial purposes only, such as research, private study and education. 
Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. 4266139 bytes application/pdf |
collection |
NDLTD |
language |
English |
format |
Others |
sources |
NDLTD |
description |
This thesis is about chance and choice, or decisions under uncertainty. The
desire to create an intelligent agent that performs rewarding tasks in a
realistic world calls for working models of sequential decision making and
planning. In response to this grand wish, decision-theoretic planning (DTP)
has evolved from decision theory and control theory, and has been applied to
planning in artificial intelligence. Recent interest has been directed toward
Markov Decision Processes (MDPs), introduced from operations research.
While fruitful results have been obtained for fully observable MDPs,
partially observable MDPs (POMDPs) remain too difficult to solve once
observation uncertainties are incorporated. Abstraction and approximation
techniques have therefore become the focus.
This research attempts to enhance POMDPs by applying AI techniques.
In particular, we transform the linear POMDP constructs into a structured
representation based on binary decision trees and Bayesian networks to
achieve compactness. A handful of tree-oriented operations is then developed
to perform structural belief updates and value computation. Along
with the structured representation, we explore the belief space with a
heuristic online search approach, in which a best-first search strategy with
heuristic pruning is employed.
Experiments with a structured testbed domain reveal the great potential
of exploiting structure and heuristics to empower POMDPs for more practical
applications. === Science, Faculty of === Computer Science, Department of === Graduate |
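The structured belief updates described in the abstract compactly encode the standard POMDP belief update (a Bayes filter over states). As a reference point, here is a minimal flat-representation sketch; the function name and the tiny two-state domain are illustrative assumptions, not taken from the thesis.

```python
# Flat POMDP belief update: b'(s') ∝ O(o | s', a) * sum_s T(s' | s, a) * b(s).
# The thesis's tree-structured operations compute this same quantity
# without enumerating all states explicitly.

def belief_update(belief, action, observation, T, O):
    """Return the posterior belief after taking `action` and seeing `observation`."""
    states = list(belief)
    new_belief = {}
    for s2 in states:
        # Prediction step: probability of landing in s2 under the action.
        predicted = sum(T[(s, action)][s2] * belief[s] for s in states)
        # Correction step: weight by the observation likelihood.
        new_belief[s2] = O[(s2, action)][observation] * predicted
    total = sum(new_belief.values())  # P(observation | belief, action)
    if total == 0:
        raise ValueError("observation has zero probability under this belief")
    return {s: p / total for s, p in new_belief.items()}

# Illustrative two-state domain (hypothetical, for demonstration only).
T = {  # T[(s, a)][s'] = transition probability; "listen" leaves the state unchanged
    ("left", "listen"): {"left": 1.0, "right": 0.0},
    ("right", "listen"): {"left": 0.0, "right": 1.0},
}
O = {  # O[(s', a)][o] = observation probability; the sensor is 85% accurate
    ("left", "listen"): {"hear-left": 0.85, "hear-right": 0.15},
    ("right", "listen"): {"hear-left": 0.15, "hear-right": 0.85},
}

b0 = {"left": 0.5, "right": 0.5}
b1 = belief_update(b0, "listen", "hear-left", T, O)
```

Starting from a uniform belief, one accurate observation shifts the posterior toward the observed state in proportion to the sensor's accuracy.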
author |
Leung, Siu-Ki |
spellingShingle |
Leung, Siu-Ki Exploring partially observable Markov decision processes by exploiting structure and heuristic information |
author_facet |
Leung, Siu-Ki |
author_sort |
Leung, Siu-Ki |
title |
Exploring partially observable Markov decision processes by exploiting structure and heuristic information |
title_short |
Exploring partially observable Markov decision processes by exploiting structure and heuristic information |
title_full |
Exploring partially observable Markov decision processes by exploiting structure and heuristic information |
title_fullStr |
Exploring partially observable Markov decision processes by exploiting structure and heuristic information |
title_full_unstemmed |
Exploring partially observable Markov decision processes by exploiting structure and heuristic information |
title_sort |
exploring partially observable markov decision processes by exploiting structure and heuristic information |
publishDate |
2009 |
url |
http://hdl.handle.net/2429/5772 |
work_keys_str_mv |
AT leungsiuki exploringpartiallyobservablemarkovdecisionprocessesbyexploitingstructureandheuristicinformation |
_version_ |
1718587188495515648 |