Universal Reinforcement Learning

We consider an agent interacting with an unmodeled environment. At each time, the agent makes an observation, takes an action, and incurs a cost. Its actions can influence future observations and costs. The goal is to minimize the long-term average cost. We propose a novel algorithm, known as the ac...

Bibliographic Details
Main Authors: Farias, Vivek F. (Contributor), Moallemi, Ciamac C. (Author), Van Roy, Benjamin (Author), Weissman, Tsachy (Author)
Other Authors: Sloan School of Management (Contributor)
Format: Article
Language: English
Published: Institute of Electrical and Electronics Engineers, 2010-10-13T19:43:17Z.