Online decision problems with large strategy sets

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Mathematics, 2005. === Includes bibliographical references (p. 165-171). === In an online decision problem, an algorithm performs a sequence of trials, each of which involves selecting one element from a fixed set of alternatives (the...

Bibliographic Details
Main Author:	Kleinberg, Robert David
Other Authors:	F. Thomson Leighton.
Format:	Others
Language:	English
Published:	Massachusetts Institute of Technology 2006
Subjects:	Mathematics.
Online Access:	http://hdl.handle.net/1721.1/33092

id	ndltd-MIT-oai-dspace.mit.edu-1721.1-33092
record_format	oai_dc
spelling	ndltd-MIT-oai-dspace.mit.edu-1721.1-33092 2019-05-02T15:47:44Z Online decision problems with large strategy sets Kleinberg, Robert David F. Thomson Leighton. Massachusetts Institute of Technology. Dept. of Mathematics. Massachusetts Institute of Technology. Dept. of Mathematics. Mathematics. Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Mathematics, 2005. Includes bibliographical references (p. 165-171). In an online decision problem, an algorithm performs a sequence of trials, each of which involves selecting one element from a fixed set of alternatives (the "strategy set") whose costs vary over time. After T trials, the combined cost of the algorithm's choices is compared with that of the single strategy whose combined cost is minimum. Their difference is called regret, and one seeks algorithms which are efficient in that their regret is sublinear in T and polynomial in the problem size. We study an important class of online decision problems called generalized multi-armed bandit problems. In the past such problems have found applications in areas as diverse as statistics, computer science, economic theory, and medical decision-making. Most existing algorithms were efficient only in the case of a small (i.e. polynomial-sized) strategy set. We extend the theory by supplying non-trivial algorithms and lower bounds for cases in which the strategy set is much larger (exponential or infinite) and the cost function class is structured, e.g. by constraining the cost functions to be linear or convex. As applications, we consider adaptive routing in networks, adaptive pricing in electronic markets, and collaborative decision-making by untrusting peers in a dynamic environment. by Robert David Kleinberg. Ph.D. 2006-06-19T17:39:44Z 2006-06-19T17:39:44Z 2005 2005 Thesis http://hdl.handle.net/1721.1/33092 62173704 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582 171 p. 10061360 bytes 10071115 bytes application/pdf application/pdf application/pdf Massachusetts Institute of Technology
collection	NDLTD
language	English
format	Others
sources	NDLTD
topic	Mathematics.
spellingShingle	Mathematics. Kleinberg, Robert David Online decision problems with large strategy sets
description	Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Mathematics, 2005. === Includes bibliographical references (p. 165-171). === In an online decision problem, an algorithm performs a sequence of trials, each of which involves selecting one element from a fixed set of alternatives (the "strategy set") whose costs vary over time. After T trials, the combined cost of the algorithm's choices is compared with that of the single strategy whose combined cost is minimum. Their difference is called regret, and one seeks algorithms which are efficient in that their regret is sublinear in T and polynomial in the problem size. We study an important class of online decision problems called generalized multi-armed bandit problems. In the past such problems have found applications in areas as diverse as statistics, computer science, economic theory, and medical decision-making. Most existing algorithms were efficient only in the case of a small (i.e. polynomial-sized) strategy set. We extend the theory by supplying non-trivial algorithms and lower bounds for cases in which the strategy set is much larger (exponential or infinite) and the cost function class is structured, e.g. by constraining the cost functions to be linear or convex. As applications, we consider adaptive routing in networks, adaptive pricing in electronic markets, and collaborative decision-making by untrusting peers in a dynamic environment. === by Robert David Kleinberg. === Ph.D.
author2	F. Thomson Leighton.
author_facet	F. Thomson Leighton. Kleinberg, Robert David
author	Kleinberg, Robert David
author_sort	Kleinberg, Robert David
title	Online decision problems with large strategy sets
title_short	Online decision problems with large strategy sets
title_full	Online decision problems with large strategy sets
title_fullStr	Online decision problems with large strategy sets
title_full_unstemmed	Online decision problems with large strategy sets
title_sort	online decision problems with large strategy sets
publisher	Massachusetts Institute of Technology
publishDate	2006
url	http://hdl.handle.net/1721.1/33092
work_keys_str_mv	AT kleinbergrobertdavid onlinedecisionproblemswithlargestrategysets
_version_	1719028282326777856

Similar Items