|
|
|
|
LEADER |
01583 am a22002173u 4500 |
001 |
137673.2 |
042 |
|
|
|a dc
|
100 |
1 |
0 |
|a Dekel, Ofer
|e author
|
710 |
2 |
|
|a Massachusetts Institute of Technology. Operations Research Center
|e contributor
|
710 |
2 |
|
|a Massachusetts Institute of Technology. Laboratory for Information and Decision Systems
|e contributor
|
710 |
2 |
|
|a Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
|e contributor
|
700 |
1 |
0 |
|a Flajolet, Arthur
|e author
|
700 |
1 |
0 |
|a Haghtalab, Nika
|e author
|
700 |
1 |
0 |
|a Jaillet, Patrick
|e author
|
245 |
0 |
0 |
|a Online learning with a hint
|
260 |
|
|
|c 2021-11-08T15:22:31Z.
|
856 |
|
|
|z Get fulltext
|u https://hdl.handle.net/1721.1/137673.2
|
520 |
|
|
|a © 2017 Neural information processing systems foundation. All rights reserved. We study a variant of online linear optimization where the player receives a hint about the loss function at the beginning of each round. The hint is given in the form of a vector that is weakly correlated with the loss vector on that round. We show that the player can benefit from such a hint if the set of feasible actions is sufficiently round. Specifically, if the set is strongly convex, the hint can be used to guarantee a regret of O(log(T)), and if the set is q-uniformly convex for q ∈ (2, 3), the hint can be used to guarantee a regret of o(√T). In contrast, we establish Ω(√T) lower bounds on regret when the set of feasible actions is a polyhedron.
|
520 |
|
|
|a Office of Naval Research (Grant N00014-15-1-2083)
|
546 |
|
|
|a en
|
655 |
7 |
|
|a Article
|