Decision making with inference and learning methods
Main Author: Hoffman, Matthew William
Format: Thesis/Dissertation
Language: English
Published: University of British Columbia, 2013
Department: Computer Science, Faculty of Science
Rights: Attribution-NonCommercial-NoDerivatives 4.0 International (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Online Access: http://hdl.handle.net/2429/44083
Description:
In this work we consider probabilistic approaches to sequential decision making. The ultimate goal is to provide methods by which decision making problems can be attacked with approaches and algorithms originally built for probabilistic inference. This allows us to directly apply a wide variety of popular, practical algorithms to these tasks. In Chapter 1 we provide an overview of the general problem of sequential decision making and a broad description of various solution methods. Much of the remaining work of this thesis then proceeds by relying on probabilistic reinterpretations of the decision making process. This strategy of reducing learning problems to simpler inference tasks has proven very fruitful across machine learning, and we expect similar improvements to arise in the control and reinforcement learning fields.

The approaches of Chapters 2–3 build on the framework of Toussaint and Storkey [2006], which reformulates the solution of a Markov decision process as maximum-likelihood estimation in an equivalent probabilistic model. In Chapter 2 we use this framework to construct an expectation-maximization (EM) algorithm for continuous, linear-Gaussian models with mixture-of-Gaussian rewards, extending popular linear-quadratic reward models to a much more general setting. We also show how to extend this probabilistic framework to continuous-time processes. Chapter 3 builds further on these methods to introduce a Bayesian approach to policy search using Markov chain Monte Carlo.

In Chapter 4 we depart from the setting of direct policy search and instead consider value function estimation. In particular, we use least-squares temporal difference learning to reduce value function estimation to a more standard regression problem. In this chapter we specifically tackle regularization methods that encourage sparse solutions.

In Chapters 5–6 we consider the task of optimization as a sequential decision problem. Chapter 5 introduces the bandit framework and discusses a number of variations. Chapter 6 then discusses a related approach to optimization that uses Bayesian estimates of the underlying, unknown function, and finally introduces a novel approach to choosing among different point-selection heuristics.
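To make the Chapters 2–3 reformulation concrete: in the Toussaint and Storkey [2006] construction, expected discounted reward is rewritten as the likelihood of an auxiliary binary "reward event" in a mixture of finite-horizon models, so that policy optimization becomes likelihood maximization. The following is a hedged reconstruction in generic notation, not the thesis's own:

```latex
% Hedged reconstruction of the MDP-as-likelihood objective
% (notation is assumed, not taken verbatim from the thesis).
\begin{align}
  p(r = 1 \mid \theta)
    &= \sum_{t=0}^{\infty} P(T = t)
       \sum_{s,a} p(s_t = s,\, a_t = a \mid \theta, T = t)\,
       p(r = 1 \mid s, a), \\
  P(T = t) &= (1 - \gamma)\,\gamma^{t},
  \qquad
  p(r = 1 \mid s, a) \propto R(s, a).
\end{align}
```

Because the right-hand side is a likelihood with latent variables (the horizon T and the trajectory), maximizing it over policy parameters θ is, up to a constant, maximizing expected discounted reward, and standard EM machinery applies; this is what enables the linear-Gaussian, mixture-of-Gaussian extension of Chapter 2.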
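The Chapter 4 reduction of value estimation to regression can be sketched in a few lines. Below is a minimal least-squares temporal difference (LSTD) solver; the feature map, data layout, and ridge penalty are illustrative assumptions (the thesis concerns richer, sparsity-inducing regularizers such as L1 penalties, not shown here):

```python
# Minimal LSTD sketch: value estimation posed as a linear system,
# as the abstract describes. Data layout is an assumption.
import numpy as np

def lstd(phi, phi_next, rewards, gamma=0.99, ridge=1e-3):
    """Estimate weights w so that V(s) ~ phi(s) @ w.

    phi:      (n, d) features of visited states s_t
    phi_next: (n, d) features of successor states s_{t+1}
    rewards:  (n,)   observed rewards r_t
    ridge:    small L2 term for numerical stability; a
              sparsity-inducing penalty would replace this.
    """
    A = phi.T @ (phi - gamma * phi_next)   # (d, d) model matrix
    b = phi.T @ rewards                    # (d,)   target vector
    return np.linalg.solve(A + ridge * np.eye(A.shape[0]), b)
```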
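For the bandit setting of Chapter 5, a standard index policy such as UCB1 illustrates the sequential-decision flavor of the problem. This is one textbook variation among the many the chapter surveys, not necessarily the algorithm it analyzes:

```python
# UCB1 sketch for the multi-armed bandit problem: play the arm
# with the highest optimistic (mean + exploration bonus) index.
import numpy as np

def ucb1(pull, n_arms, horizon):
    """pull(k) returns a stochastic reward in [0, 1] for arm k."""
    counts = np.zeros(n_arms)
    means = np.zeros(n_arms)
    for t in range(horizon):
        if t < n_arms:
            k = t                                    # play each arm once
        else:
            bonus = np.sqrt(2 * np.log(t) / counts)  # exploration bonus
            k = int(np.argmax(means + bonus))
        r = pull(k)
        counts[k] += 1
        means[k] += (r - means[k]) / counts[k]       # running average
    return means, counts
```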
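Finally, Chapter 6's Bayesian view of optimization maintains a posterior over the unknown function and selects evaluation points with a point-selection heuristic such as expected improvement. The sketch below assumes a one-dimensional Gaussian-process surrogate with an RBF kernel and a finite candidate grid; the thesis's novel contribution of choosing among several such heuristics is not shown:

```python
# Hedged Bayesian-optimization sketch: GP posterior + expected
# improvement. Kernel and candidate grid are assumptions.
import numpy as np
from scipy.stats import norm

def rbf(a, b, ell=0.2):
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

def gp_posterior(x_obs, y_obs, x_cand, noise=1e-6):
    K = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    Ks = rbf(x_obs, x_cand)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    mu = Ks.T @ alpha                          # posterior mean
    v = np.linalg.solve(L, Ks)
    var = np.clip(1.0 - np.sum(v**2, axis=0), 1e-12, None)
    return mu, np.sqrt(var)                    # mean, std at candidates

def propose(x_obs, y_obs, x_cand):
    """Pick the candidate maximizing expected improvement."""
    mu, sd = gp_posterior(np.asarray(x_obs), np.asarray(y_obs), x_cand)
    best = max(y_obs)                          # maximization convention
    z = (mu - best) / sd
    ei = (mu - best) * norm.cdf(z) + sd * norm.pdf(z)
    return x_cand[int(np.argmax(ei))]          # next evaluation point
```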