Non-linear contextual bandits
The multi-armed bandit framework can be motivated by any problem where there is an abundance of choice and the utility of trying something new must be balanced with that of going with the status quo. This is a trade-off that is present in the everyday problem of where and what to eat: should I try...
Main Author: | |
---|---|
Language: | English |
Published: | University of British Columbia, 2012 |
Online Access: | http://hdl.handle.net/2429/42191 |