Finding minimal action sequences with a simple evaluation of actions

Bibliographic Details
Main Authors:	Ashvin Shah, Kevin N. Gurney
Format:	Article
Language:	English
Published:	Frontiers Media S.A. 2014-11-01
Series:	Frontiers in Computational Neuroscience
Subjects:	Dopamine; reinforcement learning; intrinsic motivation; optimal control; redundancy; action discovery
Online Access:	http://journal.frontiersin.org/Journal/10.3389/fncom.2014.00151/full

Description
Summary:	Animals are able to discover the minimal number of actions that achieves an outcome (the minimal action sequence). In most accounts of this, actions are associated with a measure of behavior that is higher for actions that lead to the outcome with a shorter action sequence, and learning mechanisms find the actions associated with the highest measure. In this sense, previous accounts focus on more than the simple binary signal of "was the outcome achieved?"; they focus on "how well was the outcome achieved?" However, such mechanisms may not govern all types of behavioral development. In particular, in the process of action discovery (Redgrave and Gurney, 2006), actions are reinforced if they simply lead to a salient outcome because biological reinforcement signals occur too quickly to evaluate the consequences of an action beyond an indication of the outcome's occurrence. Thus, action discovery mechanisms focus on the simple evaluation of "was the outcome achieved?" and not "how well was the outcome achieved?" Notwithstanding this impoverishment of information, can the process of action discovery find the minimal action sequence? We address this question by implementing computational mechanisms, referred to in this paper as no-cost learning rules, in which each action that leads to the outcome is associated with the same measure of behavior. No-cost rules focus on "was the outcome achieved?" and are consistent with action discovery. No-cost rules discover the minimal action sequence in simulated tasks and execute it for a substantial amount of time. Extensive training, however, results in extraneous actions, suggesting that a separate process (which has been proposed in action discovery) must attenuate learning if no-cost rules participate in behavioral development. We describe how no-cost rules develop behavior, what happens when attenuation is disrupted, and relate the new mechanisms to wider computational and biological context.
ISSN:	1662-5188

Similar Items