Computational methods for Markov decision problems

In this thesis we study computational methods for finite discounted Markov decision problems and finite discounted parametric Markov decision problems over an infinite horizon. For the former problem our emphasis is on finding methods to significantly reduce the effort required to determine an optim...
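
The abstract does not spell out the algorithms, so the following is only a minimal sketch of the general idea it describes: modified policy iteration combined with permanent elimination of non-optimal actions, using upper and lower bounds on the optimal expected total discounted return. The bounds used here are the standard MacQueen/Porteus-type bounds, which is an assumption. The thesis's own elimination tests, its single-iteration procedures, and its use of scalar extrapolation may differ. All names (modified_policy_iteration, P, r, beta, m, tol) are hypothetical and chosen for illustration.

```python
import numpy as np


def modified_policy_iteration(P, r, beta, m=5, tol=1e-8, max_iter=1000):
    """Illustrative sketch, not the thesis's implementation.

    P    : array (A, S, S), P[a, s, t] = transition probability s -> t under a
    r    : array (S, A), one-step rewards
    beta : discount factor, 0 <= beta < 1
    m    : number of partial policy-evaluation sweeps per iteration
    Returns (policy, value estimate, mask of actions not yet eliminated).
    """
    n_actions, n_states, _ = P.shape
    states = np.arange(n_states)
    active = np.ones((n_states, n_actions), dtype=bool)
    # Starting below the optimal value gives Bv >= v, so iterates increase.
    v = np.full(n_states, r.min() / (1.0 - beta))
    scale = beta / (1.0 - beta)

    for _ in range(max_iter):
        # Improvement phase, searched only over actions still active.
        q = r + beta * np.einsum("ast,t->sa", P, v)
        q[~active] = -np.inf
        policy = q.argmax(axis=1)
        bv = q.max(axis=1)

        # Bounds on the optimal value: lower <= v* <= upper.
        delta = bv - v
        lower = bv + scale * delta.min()
        upper = bv + scale * delta.max()

        # Permanent elimination: action a in state s cannot be optimal if
        # even its most optimistic return falls below the lower bound.
        q_upper = r + beta * np.einsum("ast,t->sa", P, upper)
        active &= q_upper >= lower[:, None]
        active[states, policy] = True  # safeguard against rounding error

        # Stop once the bounds are within tol of each other.
        if scale * (delta.max() - delta.min()) < tol:
            break

        # Partial evaluation: m sweeps of the fixed-policy operator.
        P_d = P[policy, states]   # (S, S) transition matrix of the policy
        r_d = r[states, policy]   # (S,) reward vector of the policy
        u = bv
        for _ in range(m):
            u = r_d + beta * (P_d @ u)
        v = u

    return policy, 0.5 * (lower + upper), active
```

In this sketch the improvement phase still evaluates every action and then masks the eliminated ones; an implementation aiming at the computational savings the abstract reports would skip eliminated actions when forming the products, which is where the reduction in work comes from.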

Bibliographic Details
Main Author:	Shin, Moon Chirl
Language:	English
Published:	2010
Online Access:	http://hdl.handle.net/2429/22353

id	ndltd-UBC-oai-circle.library.ubc.ca-2429-22353
record_format	oai_dc
spelling	ndltd-UBC-oai-circle.library.ubc.ca-2429-22353 2018-01-05T17:41:37Z Computational methods for Markov decision problems Shin, Moon Chirl Business, Sauder School of Operations and Logistics (OPLOG), Division of Graduate 2010-03-23T19:41:41Z 2010-03-23T19:41:41Z 1980 Text Thesis/Dissertation http://hdl.handle.net/2429/22353 eng For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.
collection	NDLTD
language	English
sources	NDLTD
description	In this thesis we study computational methods for finite discounted Markov decision problems and finite discounted parametric Markov decision problems over an infinite horizon. For the former problem, our emphasis is on finding methods that significantly reduce the effort required to determine an optimal policy. We discuss the implementation of Porteus' scalar extrapolation methods in the modified policy iteration algorithm and show that using only a final scalar extrapolation gives the same results as applying scalar extrapolation at each iteration and then using a final scalar extrapolation. Action elimination procedures for the policy iteration and modified policy iteration algorithms are presented. The purpose of these techniques is to reduce the size of the action space to be searched in the improvement phase of the algorithm. A method for eliminating non-optimal actions for all subsequent iterations, using upper and lower bounds on the optimal expected total discounted return, is presented along with procedures for eliminating actions that cannot be part of the policy chosen in the improvement phase of the next iteration. A numerical comparison of these procedures on Howard's automobile replacement problem and on a large randomly generated problem suggests that using modified policy iteration together with one of the single-iteration elimination procedures will lead to large savings in computational time for problems with large state spaces. Modifications of the algorithm to reduce storage space are also discussed. For finite discounted Markov decision problems in which the reward vector is parameterized by a scalar, we present an algorithm to determine the optimal policy for each value of the parameter within an interval. The algorithm is based on using approximations of values to resolve difficulties caused by roundoff error. Several action elimination procedures are also presented for this problem. Bi-criterion Markov decision problems and Markov decision problems with a single constraint are formulated as parametric Markov decision problems. A numerical comparison of algorithms with and without action elimination procedures is carried out on a two-criterion version of Howard's automobile replacement problem. The results suggest that the algorithm with one of the action elimination procedures will lead to an efficient solution of this problem. Affiliation: Business, Sauder School of; Operations and Logistics (OPLOG), Division of; Graduate
author	Shin, Moon Chirl
spellingShingle	Shin, Moon Chirl Computational methods for Markov decision problems
author_facet	Shin, Moon Chirl
author_sort	Shin, Moon Chirl
title	Computational methods for Markov decision problems
title_short	Computational methods for Markov decision problems
title_full	Computational methods for Markov decision problems
title_fullStr	Computational methods for Markov decision problems
title_full_unstemmed	Computational methods for Markov decision problems
title_sort	computational methods for markov decision problems
publishDate	2010
url	http://hdl.handle.net/2429/22353
work_keys_str_mv	AT shinmoonchirl computationalmethodsformarkovdecisionproblems
_version_	1718591997569138688

Similar Items