Bootstrap Thompson Sampling and Sequential Decision Problems in the Behavioral Sciences

Behavioral scientists are increasingly able to conduct randomized experiments in settings that enable rapidly updating probabilities of assignment to treatments (i.e., arms). Thus, many behavioral science experiments can be usefully formulated as sequential decision problems. This article reviews ve...


Bibliographic Details
Main Authors:	Dean Eckles, Maurits Kaptein
Format:	Article
Language:	English
Published:	SAGE Publishing 2019-06-01
Series:	SAGE Open
Online Access:	https://doi.org/10.1177/2158244019851675

id	doaj-e4c3a73ebb5845c4a8d3471393455042
record_format	Article
spelling	doaj-e4c3a73ebb5845c4a8d3471393455042 | 2020-11-25T03:20:34Z | eng | SAGE Publishing | SAGE Open | 2158-2440 | 2019-06-01 | 9 | 10.1177/2158244019851675 | Bootstrap Thompson Sampling and Sequential Decision Problems in the Behavioral Sciences | Dean Eckles (0), Maurits Kaptein (1) | (0) Massachusetts Institute of Technology, Cambridge, USA; (1) Jheronimus Academy of Data Science, ’s-Hertogenbosch, The Netherlands | Behavioral scientists are increasingly able to conduct randomized experiments in settings that enable rapidly updating probabilities of assignment to treatments (i.e., arms). Thus, many behavioral science experiments can be usefully formulated as sequential decision problems. This article reviews versions of the multiarmed bandit problem with an emphasis on behavioral science applications. One popular method for such problems is Thompson sampling, which is appealing for randomizing assignment and being asymptotically consistent in selecting the best arm. Here, we show the utility of bootstrap Thompson sampling (BTS), which replaces the posterior distribution with the bootstrap distribution. This often has computational and practical advantages. We illustrate its robustness to model misspecification, which is a common concern in behavioral science applications. We show how BTS can be readily adapted to be robust to dependent data, such as repeated observations of the same units, which is common in behavioral science applications. We use simulations to illustrate parametric Thompson sampling and BTS for Bernoulli bandits, factorial Gaussian bandits, and bandits with repeated observations of the same units. | https://doi.org/10.1177/2158244019851675
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Dean Eckles Maurits Kaptein
spellingShingle	Dean Eckles Maurits Kaptein Bootstrap Thompson Sampling and Sequential Decision Problems in the Behavioral Sciences SAGE Open
author_facet	Dean Eckles Maurits Kaptein
author_sort	Dean Eckles
title	Bootstrap Thompson Sampling and Sequential Decision Problems in the Behavioral Sciences
publisher	SAGE Publishing
series	SAGE Open
issn	2158-2440
publishDate	2019-06-01
description	Behavioral scientists are increasingly able to conduct randomized experiments in settings that enable rapidly updating probabilities of assignment to treatments (i.e., arms). Thus, many behavioral science experiments can be usefully formulated as sequential decision problems. This article reviews versions of the multiarmed bandit problem with an emphasis on behavioral science applications. One popular method for such problems is Thompson sampling, which is appealing for randomizing assignment and being asymptotically consistent in selecting the best arm. Here, we show the utility of bootstrap Thompson sampling (BTS), which replaces the posterior distribution with the bootstrap distribution. This often has computational and practical advantages. We illustrate its robustness to model misspecification, which is a common concern in behavioral science applications. We show how BTS can be readily adapted to be robust to dependent data, such as repeated observations of the same units, which is common in behavioral science applications. We use simulations to illustrate parametric Thompson sampling and BTS for Bernoulli bandits, factorial Gaussian bandits, and bandits with repeated observations of the same units.
url	https://doi.org/10.1177/2158244019851675

