Upper Bounds on the Performance of Discretisation in Reinforcement Learning

Bibliographic Details
Main Author: Michael Robin Mitchley
Format: Article
Language: English
Published: South African Institute of Computer Scientists and Information Technologists 2015-12-01
Series: South African Computer Journal
Online Access: http://sacj.cs.uct.ac.za/index.php/sacj/article/view/284
Description
Summary: Reinforcement learning is a machine learning framework whereby an agent learns to perform a task by maximising its total reward received for selecting actions in each state. The policy mapping states to actions that the agent learns is either represented explicitly, or implicitly through a value function. It is common in reinforcement learning to discretise a continuous state space using tile coding or binary features. We prove an upper bound on the performance of discretisation for direct policy representation or value function approximation.
ISSN: 1015-7999, 2313-7835
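
Illustration: the summary mentions discretising a continuous state space with tile coding or binary features. The following is a minimal sketch of one common form of tile coding for a one-dimensional state, and is not taken from the article. The function names, the number of tilings, and the offset scheme are all assumptions chosen for illustration.

```python
def tile_indices(x, low, high, n_tilings=4, tiles_per_tiling=8):
    """Return one active tile index per tiling for a scalar state x in [low, high].

    Hypothetical helper: each tiling is a uniform grid shifted by a
    fraction of the tile width (an assumed, commonly used offset scheme).
    """
    width = (high - low) / tiles_per_tiling
    active = []
    for t in range(n_tilings):
        offset = t * width / n_tilings
        idx = int((x - low + offset) // width)
        # One extra tile per tiling so the shifted grid still covers [low, high].
        idx = min(max(idx, 0), tiles_per_tiling)
        active.append(t * (tiles_per_tiling + 1) + idx)
    return active


def binary_features(x, low, high, n_tilings=4, tiles_per_tiling=8):
    """Binary feature vector with exactly one active entry per tiling."""
    phi = [0] * (n_tilings * (tiles_per_tiling + 1))
    for i in tile_indices(x, low, high, n_tilings, tiles_per_tiling):
        phi[i] = 1
    return phi


near_a = binary_features(0.50, 0.0, 1.0)
near_b = binary_features(0.51, 0.0, 1.0)
far = binary_features(0.60, 0.0, 1.0)
```

In this sketch, states 0.50 and 0.51 fall in the same tile of every tiling and so receive identical features, while 0.60 does not. Any policy or value function defined on these features alone must therefore treat 0.50 and 0.51 identically.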