Learning Adversarial Markov Decision Processes with Bandit Feedback and Unknown Transition

Bibliographic Details
Main Authors: Jin, Chi (Author), Jin, Tiancheng (Author), Luo, Haipeng (Author), Sra, Suvrit (Author), Yu, Tiancheng (Author)
Format: Article
Language: English
Published: 2022-07-20T16:41:40Z