Online learning with a hint

© 2017 Neural Information Processing Systems Foundation. All rights reserved. We study a variant of online linear optimization where the player receives a hint about the loss function at the beginning of each round. The hint is given in the form of a vector that is weakly correlated with the loss ve...

Bibliographic Details
Main Authors: Dekel, Ofer (Author), Flajolet, Arthur (Author), Haghtalab, Nika (Author), Jaillet, Patrick (Author)
Other Authors: Massachusetts Institute of Technology. Operations Research Center (Contributor), Massachusetts Institute of Technology. Laboratory for Information and Decision Systems (Contributor), Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science (Contributor)
Format: Article
Language: English
Published: 2021-11-08T15:22:31Z.
Online Access: Get fulltext
LEADER 01583 am a22002173u 4500
001 137673.2
042 |a dc 
100 1 0 |a Dekel, Ofer  |e author 
100 1 0 |a Massachusetts Institute of Technology. Operations Research Center  |e contributor 
100 1 0 |a Massachusetts Institute of Technology. Laboratory for Information and Decision Systems  |e contributor 
100 1 0 |a Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science  |e contributor 
700 1 0 |a Flajolet, Arthur  |e author 
700 1 0 |a Haghtalab, Nika  |e author 
700 1 0 |a Jaillet, Patrick  |e author 
245 0 0 |a Online learning with a hint 
260 |c 2021-11-08T15:22:31Z. 
856 |z Get fulltext  |u https://hdl.handle.net/1721.1/137673.2 
520 |a © 2017 Neural Information Processing Systems Foundation. All rights reserved. We study a variant of online linear optimization where the player receives a hint about the loss function at the beginning of each round. The hint is given in the form of a vector that is weakly correlated with the loss vector on that round. We show that the player can benefit from such a hint if the set of feasible actions is sufficiently round. Specifically, if the set is strongly convex, the hint can be used to guarantee a regret of O(log(T)), and if the set is q-uniformly convex for q ∈ (2, 3), the hint can be used to guarantee a regret of o(√T). In contrast, we establish Ω(√T) lower bounds on regret when the set of feasible actions is a polyhedron. 
520 |a Office of Naval Research (Grant N00014-15-1-2083) 
546 |a en 
655 7 |a Article