Latent Space Reinforcement Learning
Often we have to handle high-dimensional spaces when we want to learn motor skills for robots. In policy search tasks we have to find several parameters to learn a desired movement. This high dimensionality in the parameters can be challenging for reinforcement learning algorithms, since more samples are needed to find an optimal solution with every additional dimension. On the other hand, if the robot has a high number of actuators, an inherent correlation between them can be found for a specific motor task, which we can exploit for faster convergence.

One possibility is to use techniques that reduce the dimensionality of the space, which in most applications are used as a pre-processing step or as an independent process. In this thesis we present a novel algorithm which combines the theory of policy search with probabilistic dimensionality reduction to uncover the hidden structure of high-dimensional action spaces. Evaluations on an inverse kinematics task indicate that the presented algorithm is able to outperform the reference algorithms PoWER and CMA-ES, especially in high-dimensional spaces. Furthermore, we evaluate our algorithm on a real-world task in which a NAO robot learns to lift its leg while keeping its balance. The issue of collecting samples for learning on a real robot in such a task, which is often very time-consuming and costly, is addressed here by using a small number of samples in each iteration.
Main Author: | Luck, Kevin Sebastian |
---|---|
Format: | Bachelor Thesis |
Language: | German |
Published: | 2014 |
License: | CC-BY 2.5 DE (Creative Commons, Attribution) |
Online Access: | https://tuprints.ulb.tu-darmstadt.de/3832/1/Latent_Space_Reinforcement_Learning_KSLuck.pdf |
Citation: | Luck, Kevin Sebastian (2014): Latent Space Reinforcement Learning. Darmstadt, Technische Universität [Bachelor Thesis] |
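To make the idea described in the abstract a bit more concrete, the following is a minimal, hypothetical sketch of reward-weighted policy search carried out in a low-dimensional latent space that is projected into a high-dimensional parameter space. It only illustrates the general concept under assumed names and a toy setup; it is not the algorithm developed in the thesis. The fixed projection matrix `W`, the dimensions, and `toy_reward` are all assumptions made for this example.

```python
import numpy as np

# Illustrative sketch only: generic reward-weighted policy search in a small
# latent space, projected to a high-dimensional actuator/parameter space.
# All names, dimensions, and the toy reward are assumptions for illustration.

def toy_reward(theta, target):
    """Toy reward: closeness of the high-dimensional parameters to a target."""
    return np.exp(-np.sum((theta - target) ** 2))

rng = np.random.default_rng(0)
d_full, d_latent = 30, 3          # high-dimensional parameter space, small latent space
n_samples, n_iters = 10, 50       # few rollouts per iteration, as on a real robot

W = rng.normal(size=(d_full, d_latent))   # assumed fixed projection (latent -> full)
target = rng.normal(size=d_full)          # assumed task target in the full space
mu = np.zeros(d_latent)                   # mean of the latent search distribution
sigma = 1.0                               # exploration noise in latent space

for _ in range(n_iters):
    # Sample exploration noise in the latent space only.
    z = mu + sigma * rng.normal(size=(n_samples, d_latent))
    thetas = z @ W.T                      # project latent samples to full parameters
    rewards = np.array([toy_reward(t, target) for t in thetas])

    # Reward-weighted update of the latent mean (in the spirit of PoWER/EM).
    weights = rewards / (rewards.sum() + 1e-12)
    mu = weights @ z

print("final reward:", toy_reward(mu @ W.T, target))
```

Sampling exploration only in the latent space is what keeps the number of required rollouts small when the actuators are strongly correlated, which mirrors the motivation given in the abstract; how the latent structure itself is learned probabilistically is the subject of the thesis and is not shown here.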