Learning Adversarial Markov Decision Processes with Bandit Feedback and Unknown Transition
Main Authors: | Jin, Chi (Author), Jin, Tiancheng (Author), Luo, Haipeng (Author), Sra, Suvrit (Author), Yu, Tiancheng (Author) |
---|---|
Format: | Article |
Language: | English |
Published: | 2022-07-20T16:41:40Z |
Online Access: | Get fulltext |
Similar Items
- Online Learning in Unknown Markov Games
  by: Tian, Yi, et al.
  Published: (2022)
- Efficient Online Learning with Bandit Feedback
  by: Liu, Fang
  Published: (2020)
- Provably Efficient Algorithms for Multi-Objective Competitive RL
  by: Yu, Tiancheng, et al.
  Published: (2022)
- Prior convictions: Black-box adversarial attacks with bandits and priors
  by: Ilyas, Andrew, et al.
  Published: (2021)
- Multi-armed bandits with unconventional feedback
  by: Gajane, Pratik
  Published: (2017)