Implicit Value Updating Explains Transitive Inference Performance: The Betasort Model.

Transitive inference (the ability to infer that B > D given that B > C and C > D) is a widespread characteristic of serial learning, observed in dozens of species. Despite these robust behavioral effects, reinforcement learning models that rely on reward prediction error or associative strength routinely fail to perform these inferences. We propose an algorithm called betasort, inspired by cognitive processes, which performs transitive inference at low computational cost. This is accomplished by (1) representing stimulus positions along a unit span using beta distributions, (2) treating positive and negative feedback asymmetrically, and (3) updating the position of every stimulus on every trial, whether or not that stimulus was visible. Performance was compared for rhesus macaques, humans, and the betasort algorithm, as well as for Q-learning, an established reward prediction error (RPE) model. Of these, only Q-learning failed to respond above chance during critical test trials. Betasort's success (compared to RPE models) and its computational efficiency (compared to full Markov decision process implementations) suggest that the study of reinforcement learning in organisms will be best served by a feature-driven approach to comparing formal models.
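The three mechanisms enumerated in the abstract lend themselves to a compact implementation. The sketch below is a minimal, illustrative Python variant assembled from the abstract's description alone; the class name, the `recall` relaxation parameter, and the exact increment rules are assumptions for illustration, not the published algorithm's equations. It represents each stimulus's position on the unit span as beta-distribution counts, repositions stimuli only after negative feedback, and touches every stimulus on every trial, presented or not.

```python
import random

class BetasortSketch:
    """Toy betasort-style learner. Illustrative only: update rules and
    parameter names are assumptions, not the published algorithm."""

    def __init__(self, stimuli, recall=0.95):
        self.stimuli = list(stimuli)
        # (1) Each stimulus's position on the unit span is a beta
        # distribution, tracked as upper/lower counts (U, L).
        self.U = {s: 0.0 for s in self.stimuli}
        self.L = {s: 0.0 for s in self.stimuli}
        self.recall = recall  # memory-relaxation rate, applied every trial

    def pos(self, s):
        # Posterior mean position under a uniform Beta(1, 1) prior.
        return (self.U[s] + 1.0) / (self.U[s] + self.L[s] + 2.0)

    def choose(self, a, b):
        # Sample a position for each presented stimulus; pick the higher draw.
        draw = lambda s: random.betavariate(self.U[s] + 1.0, self.L[s] + 1.0)
        return a if draw(a) > draw(b) else b

    def update(self, chosen, unchosen, rewarded):
        # Relax all counts toward uncertainty (forgetting), every trial.
        for s in self.stimuli:
            self.U[s] *= self.recall
            self.L[s] *= self.recall
        if rewarded:
            # (2) Positive feedback: no repositioning; every stimulus is
            # consolidated at its current estimated position.
            for s in self.stimuli:
                p = self.pos(s)
                self.U[s] += p
                self.L[s] += 1.0 - p
        else:
            # (2) Negative feedback: demote the chosen item, promote the
            # other, and (3) implicitly shift every non-presented stimulus
            # so the implied order stays consistent.
            a, b = self.pos(chosen), self.pos(unchosen)
            for s in self.stimuli:
                if s == chosen:
                    self.L[s] += 1.0
                elif s == unchosen:
                    self.U[s] += 1.0
                elif self.pos(s) > max(a, b):
                    self.U[s] += 1.0
                elif self.pos(s) < min(a, b):
                    self.L[s] += 1.0
```

Trained only on adjacent pairs, a learner of this kind can choose B over D at test: every error implicitly pushes stimuli estimated above the presented pair upward and those below it downward, so a consistent order emerges without any pair-specific reward values. A hypothetical training loop:

```python
random.seed(1)
model = BetasortSketch("ABCDE")
pairs = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]  # first item is correct
for _ in range(400):
    a, b = random.choice(pairs)
    pick = model.choose(a, b)
    model.update(pick, b if pick == a else a, rewarded=(pick == a))
# Critical test: B vs D were never paired during training.
wins = sum(model.choose("B", "D") == "B" for _ in range(1000))
print(f"B chosen over D on {wins / 10:.1f}% of test trials")
```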

Bibliographic Details
Main Authors: Greg Jensen, Fabian Muñoz, Yelda Alkan, Vincent P Ferrera, Herbert S Terrace
Format: Article
Language: English
Published: Public Library of Science (PLoS), 2015-01-01
Series: PLoS Computational Biology, Vol. 11, Iss. 9, e1004523
ISSN: 1553-734X, 1553-7358
DOI: 10.1371/journal.pcbi.1004523
Online Access: http://europepmc.org/articles/PMC4583549?pdf=render