Implicit Value Updating Explains Transitive Inference Performance: The Betasort Model.
Transitive inference (the ability to infer that B > D given that B > C and C > D) is a widespread characteristic of serial learning, observed in dozens of species. Despite these robust behavioral effects, reinforcement learning models reliant on reward prediction error or associative strength routinely fail to perform these inferences. We propose an algorithm called betasort, inspired by cognitive processes, which performs transitive inference at low computational cost. This is accomplished by (1) representing stimulus positions along a unit span using beta distributions, (2) treating positive and negative feedback asymmetrically, and (3) updating the position of every stimulus during every trial, whether that stimulus was visible or not. Performance was compared for rhesus macaques, humans, and the betasort algorithm, as well as Q-learning, an established reward-prediction error (RPE) model. Of these, only Q-learning failed to respond above chance during critical test trials. Betasort's success (when compared to RPE models) and its computational efficiency (when compared to full Markov decision process implementations) suggest that the study of reinforcement learning in organisms will be best served by a feature-driven approach to comparing formal models.
| Field | Value |
|---|---|
| Main Authors | Greg Jensen, Fabian Muñoz, Yelda Alkan, Vincent P Ferrera, Herbert S Terrace |
| Format | Article |
| Language | English |
| Published | Public Library of Science (PLoS), 2015-01-01 |
| Series | PLoS Computational Biology |
| Citation | PLoS Computational Biology 11(9): e1004523 |
| ISSN | 1553-734X, 1553-7358 |
| DOI | 10.1371/journal.pcbi.1004523 |
| Online Access | http://europepmc.org/articles/PMC4583549?pdf=render |
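The abstract attributes betasort's inference ability to three concrete mechanisms: beta-distributed position estimates on the unit span, asymmetric handling of positive versus negative feedback, and implicit updating of every stimulus on every trial. The sketch below shows one way those three features could fit together in code. It is a minimal illustration, not the published algorithm: the mean-preserving consolidation rule, the midpoint-based nudging of unpresented stimuli, and the class name `BetasortSketch` are all assumptions chosen to match the abstract's description.

```python
import numpy as np

class BetasortSketch:
    """A betasort-style learner (illustrative sketch, not the published rules).

    Stimulus i's position on the unit span is represented by a
    Beta(upper[i] + 1, lower[i] + 1) distribution; higher positions
    rank earlier in the implied list.
    """

    def __init__(self, n_stimuli, seed=0):
        self.upper = np.zeros(n_stimuli)  # pseudo-counts pulling estimates up
        self.lower = np.zeros(n_stimuli)  # pseudo-counts pulling estimates down
        self.rng = np.random.default_rng(seed)

    def means(self):
        # Posterior mean position of every stimulus.
        return (self.upper + 1.0) / (self.upper + self.lower + 2.0)

    def choose(self, a, b):
        # Feature 1: positions are *sampled* from beta distributions,
        # so choices stay stochastic wherever the distributions overlap.
        pos = self.rng.beta(self.upper + 1.0, self.lower + 1.0)
        return a if pos[a] > pos[b] else b

    def update(self, a, b, chosen, correct):
        other = b if chosen == a else a
        m = self.means()
        if correct:
            # Feature 2 (assumed form of the asymmetry): positive feedback
            # only consolidates -- pseudo-counts grow in proportion to the
            # current mean, narrowing each presented distribution in place.
            for s in (a, b):
                self.upper[s] += m[s]
                self.lower[s] += 1.0 - m[s]
        else:
            # Negative feedback repositions: the wrongly chosen stimulus
            # gains "low" evidence, the rejected one gains "high" evidence.
            self.lower[chosen] += 1.0
            self.upper[other] += 1.0
            # Feature 3 (assumed form of implicit updating): every
            # unpresented stimulus is nudged toward whichever side of the
            # presented pair its current estimate already occupies, so
            # inferred ranks shift coherently even for stimuli off screen.
            midpoint = 0.5 * (m[a] + m[b])
            for s in range(len(self.upper)):
                if s != a and s != b:
                    if m[s] > midpoint:
                        self.upper[s] += 1.0
                    else:
                        self.lower[s] += 1.0
```

Trained only on adjacent pairs of a five-item list, a learner of this shape tends to rank B above D on the untrained probe, because error trials propagate rank information even to stimuli that were not presented; the exact update equations should be taken from Jensen et al. rather than from this sketch.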
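The abstract also reports that Q-learning, the RPE baseline, responded at chance on critical test trials. A self-contained sketch of such a baseline makes the reason visible: a tabular RPE learner updates only the chosen stimulus's value, and the interior items B and D play symmetric roles during adjacent-pair training (each is rewarded against one neighbour and unrewarded against the other), so their learned values converge and the untrained B-vs-D probe offers no basis for choice. The training loop, epsilon-greedy policy, and parameter values here are illustrative assumptions, not the study's fitted model.

```python
import numpy as np

rng = np.random.default_rng(1)
adjacent = [(0, 1), (1, 2), (2, 3), (3, 4)]  # list A>B>C>D>E as indices 0..4
q = np.zeros(5)                              # one learned value per stimulus
alpha, eps = 0.1, 0.1                        # illustrative parameters

for _ in range(5000):
    a, b = adjacent[rng.integers(len(adjacent))]
    # Epsilon-greedy choice between the two presented stimuli.
    if rng.random() < eps or q[a] == q[b]:
        chosen = a if rng.random() < 0.5 else b
    else:
        chosen = a if q[a] > q[b] else b
    reward = 1.0 if chosen == min(a, b) else 0.0  # earlier item is correct
    q[chosen] += alpha * (reward - q[chosen])     # RPE update, chosen item only

# B (index 1) and D (index 3) occupy symmetric roles during training, so
# their values end up close and the B-vs-D probe is answered at chance.
print(q, "B-vs-D value gap:", q[1] - q[3])
```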