Thompson sampling-based online decision making in network routing
Online decision making is a kind of machine learning problems where decisions are made in a sequential manner so as to accumulate as many rewards as possible. Typical examples include multi-armed bandit (MAB) problems where an agent needs to decide which arm to pull in each round, and network rout...
Main Author: | |
---|---|
Other Authors: | |
Format: | Others |
Language: | English en |
Published: |
2020
|
Subjects: | |
Online Access: | http://hdl.handle.net/1828/12095 |