The Nature of Belief-Directed Exploratory Choice in Human Decision-Making

In non-stationary environments, there is a conflict between exploiting currently favored options and gaining information by exploring lesser-known options that in the past have proven less rewarding. Optimal decision making in such tasks requires considering future states of the environment (i.e., p...


Bibliographic Details
Main Authors:	W. Bradley Knox, A. Ross Otto, Peter Stone, Bradley Love
Format:	Article
Language:	English
Published:	Frontiers Media S.A. 2012-01-01
Series:	Frontiers in Psychology
Subjects:	Decision Making; reinforcement learning; exploration; planning; exploitation; ideal actor
Online Access:	http://journal.frontiersin.org/Journal/10.3389/fpsyg.2011.00398/full

id	doaj-0870c22fd1064b3a8ba2c763f16c6828
record_format	Article
spelling	doaj-0870c22fd1064b3a8ba2c763f16c6828; 2020-11-24T22:13:52Z; eng; Frontiers Media S.A.; Frontiers in Psychology; 1664-1078; 2012-01-01; 2; 10.3389/fpsyg.2011.00398; 19266; The Nature of Belief-Directed Exploratory Choice in Human Decision-Making; W. Bradley Knox (University of Texas at Austin); A. Ross Otto (University of Texas at Austin); Peter Stone (University of Texas at Austin); Bradley Love (University College London); http://journal.frontiersin.org/Journal/10.3389/fpsyg.2011.00398/full; Decision Making; reinforcement learning; exploration; planning; exploitation; ideal actor
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	W. Bradley Knox; A. Ross Otto; Peter Stone; Bradley Love
spellingShingle	W. Bradley Knox; A. Ross Otto; Peter Stone; Bradley Love; The Nature of Belief-Directed Exploratory Choice in Human Decision-Making; Frontiers in Psychology; Decision Making; reinforcement learning; exploration; planning; exploitation; ideal actor
author_facet	W. Bradley Knox; A. Ross Otto; Peter Stone; Bradley Love
author_sort	W. Bradley Knox
title	The Nature of Belief-Directed Exploratory Choice in Human Decision-Making
title_short	The Nature of Belief-Directed Exploratory Choice in Human Decision-Making
title_full	The Nature of Belief-Directed Exploratory Choice in Human Decision-Making
title_fullStr	The Nature of Belief-Directed Exploratory Choice in Human Decision-Making
title_full_unstemmed	The Nature of Belief-Directed Exploratory Choice in Human Decision-Making
title_sort	nature of belief-directed exploratory choice in human decision-making
publisher	Frontiers Media S.A.
series	Frontiers in Psychology
issn	1664-1078
publishDate	2012-01-01
description	In non-stationary environments, there is a conflict between exploiting currently favored options and gaining information by exploring lesser-known options that in the past have proven less rewarding. Optimal decision-making in such tasks requires considering future states of the environment (i.e., planning) and properly updating beliefs about the state of the environment after observing outcomes associated with choices. Optimal belief-updating is reflective in that beliefs can change without directly observing environmental change. For example, after ten seconds elapse, one might correctly believe that a traffic light last observed to be red is now more likely to be green. To understand human decision-making when rewards associated with choice options change over time, we develop a variant of the classic bandit task that is both rich enough to encompass relevant phenomena and sufficiently tractable to allow for ideal actor analysis of sequential choice behavior. We evaluate whether people update beliefs about the state of the environment in a reflexive (i.e., only in response to observed changes in reward structure) or reflective manner. In contrast to purely "random" accounts of exploratory behavior, model-based analyses of the subjects’ choices and latencies indicate that people are reflective belief-updaters. However, unlike the ideal actor model, our analyses indicate that people’s choice behavior does not reflect consideration of future environmental states. Thus, although people update beliefs in a reflective manner consistent with the ideal actor, they do not engage in optimal long-term planning, but instead myopically choose the option on every trial that is believed to have the highest immediate payoff.
topic	Decision Making; reinforcement learning; exploration; planning; exploitation; ideal actor
url	http://journal.frontiersin.org/Journal/10.3389/fpsyg.2011.00398/full


Similar Items