Adaptive Exploration through Covariance Matrix Adaptation Enables Developmental Motor Learning

The “Policy Improvement with Path Integrals” (PI2) [25] and “Covariance Matrix Adaptation - Evolutionary Strategy” [8] are considered to be state-of-the-art in direct reinforcement learning and stochastic optimization respectively. We have recently shown that incorporating covariance matrix adaptati...

Bibliographic Details
Main Authors: Stulp Freek, Oudeyer Pierre-Yves
Format: Article
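Language: English
Published: De Gruyter 2012-09-01
Series: Paladyn: Journal of Behavioral Robotics
Subjects: reinforcement learning; covariance matrix adaptation; developmental robotics; adaptive exploration; proximodistal maturation
Online Access: https://doi.org/10.2478/s13230-013-0108-6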
id doaj-3e4fb18322424b8584bbd8792bde75bb
record_format Article
spelling doaj-3e4fb18322424b8584bbd8792bde75bb; 2021-10-02T17:48:15Z; eng; De Gruyter; Paladyn: Journal of Behavioral Robotics; ISSN 2081-4836; 2012-09-01; vol. 3, no. 3, pp. 128-135; 10.2478/s13230-013-0108-6; Adaptive Exploration through Covariance Matrix Adaptation Enables Developmental Motor Learning; Stulp Freek (Robotics and Computer Vision, ENSTA-ParisTech, Paris, France); Oudeyer Pierre-Yves (Robotics and Computer Vision, ENSTA-ParisTech, Paris, France)
collection DOAJ
language English
format Article
sources DOAJ
author Stulp Freek
Oudeyer Pierre-Yves
title Adaptive Exploration through Covariance Matrix Adaptation Enables Developmental Motor Learning
publisher De Gruyter
series Paladyn: Journal of Behavioral Robotics
issn 2081-4836
publishDate 2012-09-01
description The “Policy Improvement with Path Integrals” (PI2) [25] and “Covariance Matrix Adaptation - Evolutionary Strategy” [8] are considered to be state-of-the-art in direct reinforcement learning and stochastic optimization, respectively. We have recently shown that incorporating covariance matrix adaptation into PI2 – which yields the PICMA2 algorithm – enables adaptive exploration by continually and autonomously reconsidering the exploration/exploitation trade-off. In this article, we provide an overview of our recent work on covariance matrix adaptation for direct reinforcement learning [22–24], highlight its relevance to developmental robotics, and conduct further experiments to analyze the results. We investigate two complementary phenomena from developmental robotics. First, we demonstrate PICMA2’s ability to adapt to slowly or abruptly changing tasks due to its continual and adaptive exploration. This is an important component of life-long skill learning in dynamic environments. Second, we show on a reaching task how PICMA2 successively releases degrees of freedom from proximal to more distal limbs as learning progresses. A similar effect is observed in human development, where it is known as ‘proximodistal maturation’.
topic reinforcement learning
covariance matrix adaptation
developmental robotics
adaptive exploration
proximodistal maturation
url https://doi.org/10.2478/s13230-013-0108-6