A Version of the Euler Equation in Discounted Markov Decision Processes
This paper deals with Markov decision processes (MDPs) on Euclidean spaces with an infinite horizon. An approach to study this kind of MDPs is using the dynamic programming technique (DP). Then the optimal value function is characterized through the value iteration functions. The paper provides cond...
Main Authors: | H. Cruz-Suárez, G. Zacarías-Espinoza, V. Vázquez-Guevara |
---|---|
Format: | Article |
Language: | English |
Published: | Hindawi Limited, 2012-01-01 |
Series: | Journal of Applied Mathematics |
Online Access: | http://dx.doi.org/10.1155/2012/103698 |
id | doaj-7451dcee33114737a686379c43ff4cf6 |
---|---|
record_format | Article |
spelling | doaj-7451dcee33114737a686379c43ff4cf6; 2020-11-24T23:24:25Z; eng; Hindawi Limited; Journal of Applied Mathematics; 1110-757X; 1687-0042; 2012-01-01; 2012; 10.1155/2012/103698; 103698; A Version of the Euler Equation in Discounted Markov Decision Processes; H. Cruz-Suárez 0; G. Zacarías-Espinoza 1; V. Vázquez-Guevara 2; Facultad de Ciencias Físico Matemáticas, Benemérita Universidad Autónoma de Puebla, Avenida San Claudio y Río Verde, Col. San Manuel, CU, 72570 Puebla, PUE, Mexico (same affiliation listed for all three authors); http://dx.doi.org/10.1155/2012/103698 |
collection | DOAJ |
language | English |
format | Article |
sources | DOAJ |
author | H. Cruz-Suárez; G. Zacarías-Espinoza; V. Vázquez-Guevara |
title | A Version of the Euler Equation in Discounted Markov Decision Processes |
publisher | Hindawi Limited |
series | Journal of Applied Mathematics |
issn | 1110-757X; 1687-0042 |
publishDate | 2012-01-01 |
description | This paper deals with Markov decision processes (MDPs) on Euclidean spaces with an infinite horizon. An approach to study this kind of MDPs is using the dynamic programming technique (DP). Then the optimal value function is characterized through the value iteration functions. The paper provides conditions that guarantee the convergence of maximizers of the value iteration functions to the optimal policy. Then, using the Euler equation and an envelope formula, the optimal solution of the optimal control problem is obtained. Finally, this theory is applied to a linear-quadratic control problem in order to find its optimal policy. |
url | http://dx.doi.org/10.1155/2012/103698 |