Learning the distribution with largest mean: two bandit frameworks
Over the past few years, the multi-armed bandit model has become increasingly popular in the machine learning community, partly because of applications including online content optimization. This paper reviews two different sequential learning tasks that have been considered in the bandit literature...
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: | EDP Sciences, 2017-01-01 |
Series: | ESAIM: Proceedings and Surveys |
Online Access: | https://doi.org/10.1051/proc/201760114 |