Reinforcement Learning Explains Conditional Cooperation and Its Moody Cousin.
Direct reciprocity, or repeated interaction, is a main mechanism for sustaining cooperation in social dilemmas involving two individuals. For larger groups and networks, which are probably more relevant to understanding and engineering our society, experiments employing repeated multiplayer social dilemma games have suggested that humans often show conditional cooperation and its moody variant. The mechanisms underlying these behaviors remain largely unclear. Here we provide a proximate account of this behavior by showing that individuals adopting a type of reinforcement learning, called aspiration learning, phenomenologically behave as conditional cooperators. By definition, individuals are satisfied if and only if the obtained payoff exceeds a fixed aspiration level. They reinforce actions that have resulted in satisfactory outcomes and anti-reinforce those yielding unsatisfactory outcomes. The results obtained in the present study are general in that they explain extant experimental results for both so-called moody and non-moody conditional cooperation, for prisoner's dilemma and public goods games, and for well-mixed groups and networks. In contrast to previous theory, individuals are assumed to have no access to information about what other individuals are doing, so they cannot explicitly use conditional cooperation rules. In this sense, myopic aspiration learning, in which the unconditional propensity to cooperate is modulated at every discrete time step, explains the conditional behavior of humans. Aspiration learners showing (moody) conditional cooperation obeyed a noisy GRIM-like strategy. This differs from Pavlov, a reinforcement learning strategy that promotes mutual cooperation in two-player situations.
Main Authors: | Takahiro Ezaki, Yutaka Horita, Masanori Takezawa, Naoki Masuda |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2016-07-01 |
Series: | PLoS Computational Biology |
ISSN: | 1553-734X, 1553-7358 |
Online Access: | https://doi.org/10.1371/journal.pcbi.1005034 |
DOAJ ID: | doaj-6a902c60947343a18c54c434409f9adb |
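The abstract above describes the aspiration-learning rule only in words: an agent keeps a single, unconditional propensity to cooperate, is satisfied if and only if its payoff exceeds a fixed aspiration level, reinforces its last action after a satisfactory outcome, and anti-reinforces it otherwise. The minimal Python sketch below illustrates that kind of rule for a repeated two-player prisoner's dilemma. The payoff values, aspiration level, learning rate, and the Bush-Mosteller-style functional form of the update are illustrative assumptions, not the specific model analysed in the paper.

```python
import random

# Payoff matrix for a standard two-player prisoner's dilemma.
# These values are illustrative, not taken from the paper.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


class AspirationLearner:
    """Minimal Bush-Mosteller-style aspiration learner (illustrative sketch).

    The agent keeps one unconditional probability of cooperating, is
    "satisfied" iff the realised payoff exceeds a fixed aspiration level,
    reinforces the last action after satisfaction, and anti-reinforces it
    after dissatisfaction.
    """

    def __init__(self, aspiration=2.0, learning_rate=0.4, p_cooperate=0.5):
        self.aspiration = aspiration        # fixed aspiration level
        self.learning_rate = learning_rate  # step size of the update
        self.p = p_cooperate                # current propensity to cooperate
        self.last_action = None

    def act(self):
        self.last_action = "C" if random.random() < self.p else "D"
        return self.last_action

    def update(self, payoff):
        # Normalised "stimulus": positive iff the payoff beat the aspiration.
        scale = max(abs(v - self.aspiration) for v in PAYOFF.values())
        s = (payoff - self.aspiration) / scale if scale > 0 else 0.0

        if self.last_action == "C":
            # Satisfaction pushes p toward 1; dissatisfaction pushes it toward 0.
            self.p += self.learning_rate * s * ((1 - self.p) if s >= 0 else self.p)
        else:
            # Symmetric update for defection: satisfaction with D lowers p.
            self.p -= self.learning_rate * s * (self.p if s >= 0 else (1 - self.p))
        self.p = min(1.0, max(0.0, self.p))


if __name__ == "__main__":
    a, b = AspirationLearner(), AspirationLearner()
    for _ in range(200):
        x, y = a.act(), b.act()
        a.update(PAYOFF[(x, y)])
        b.update(PAYOFF[(y, x)])
    print(f"final cooperation propensities: {a.p:.2f}, {b.p:.2f}")
```

Because such an agent never observes its co-players' choices, any apparent conditioning on others' cooperation can only arise indirectly, through the payoff the agent received, which is the point the abstract emphasises.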