Developing locomotion skills with deep reinforcement learning
Main Author: | Peng, Xue Bin
---|---
Language: | English
Published: | University of British Columbia, 2017
Online Access: | http://hdl.handle.net/2429/61370
id | ndltd-UBC-oai-circle.library.ubc.ca-2429-61370
---|---
record_format | oai_dc
collection | NDLTD
language | English
sources | NDLTD
description |
While physics-based models for passive phenomena such as cloth and fluids have been widely adopted in computer animation, physics-based character simulation remains a challenging problem. One of the major hurdles for character simulation is that of control: the modeling of a character's behaviour in response to its goals and environment. This challenge is further compounded by the high-dimensional and complex dynamics that often arise from these systems. A popular approach to mitigating these challenges is to build reduced models that capture important properties for a particular task. These models often leverage significant human insight, and may nonetheless overlook important information. In this thesis, we explore the application of deep reinforcement learning (DeepRL) to develop control policies that operate directly on high-dimensional, low-level representations, thereby reducing the need for manual feature engineering and enabling characters to perform more challenging tasks in complex environments.

We start by presenting a DeepRL framework for developing policies that allow characters to agilely traverse irregular terrain. The policies are represented using a mixture-of-experts model, which selects from a small collection of parameterized controllers. Our method is demonstrated on planar characters of varying morphologies and on different classes of terrain. Through the learning process, the networks develop appropriate strategies for traveling across various irregular environments without requiring extensive feature engineering.

Next, we explore the effects of different action parameterizations on the performance of RL policies. We compare policies trained using low-level actions such as torques, target velocities, target angles, and muscle activations. Performance is evaluated using a motion-imitation benchmark. For our particular task, the choice of higher-level actions that incorporate local feedback, such as target angles, leads to significant improvements in performance and learning speed.

Finally, we describe a hierarchical reinforcement learning framework for controlling the motion of a simulated 3D biped. By training each level of the hierarchy to operate at different spatial and temporal scales, the character is able to perform a variety of locomotion tasks that require a balance between short-term and long-term planning, including soccer dribbling, path following, and navigation across dynamic obstacles.

Science, Faculty of; Computer Science, Department of; Graduate
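The abstract names three technical components. As a concrete illustration of the first, the mixture-of-experts policy, here is a minimal NumPy sketch in which a gating network scores a small set of experts from the character/terrain state, and the selected expert emits parameters for its controller. All dimensions, the random weights, and the hard argmax selection are illustrative assumptions, not the thesis's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM = 200    # e.g. character pose features + terrain height samples (assumed)
NUM_EXPERTS = 4    # small collection of parameterized controllers
PARAM_DIM = 8      # parameters each controller consumes (assumed)

# Random weights stand in for trained networks.
W_gate = rng.normal(scale=0.01, size=(STATE_DIM, NUM_EXPERTS))
W_expert = rng.normal(scale=0.01, size=(NUM_EXPERTS, STATE_DIM, PARAM_DIM))

def act(state):
    """Score the experts with the gating network, then let the winner
    emit the parameters for its low-level controller."""
    gate_scores = state @ W_gate          # shape (NUM_EXPERTS,)
    k = int(np.argmax(gate_scores))       # hard selection of one expert
    params = state @ W_expert[k]          # shape (PARAM_DIM,) controller parameters
    return k, params

k, params = act(rng.normal(size=STATE_DIM))
print(f"expert {k}, params shape {params.shape}")
```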
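The second component's finding, that target-angle actions outperform raw torques, rests on the local feedback that a proportional-derivative (PD) controller adds at the simulation rate. A sketch of the contrast, with purely illustrative gains:

```python
KP, KD = 300.0, 30.0  # illustrative PD gains, not the thesis's values

def torque_from_target_angle(q, q_dot, q_target):
    """Target-angle action: a PD controller closes the loop on the joint's
    current angle q and velocity q_dot at every simulation step."""
    return KP * (q_target - q) - KD * q_dot

def torque_from_torque_action(tau):
    """Torque action: applied open-loop; all feedback must come
    from the (slower) policy itself."""
    return tau
```

Because the PD loop reacts to the joint state at every simulation step, even between policy queries, it provides the local feedback the abstract credits for faster, better learning.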
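For the third component, the hierarchical controller can be pictured as two nested loops running at different timescales: a high-level policy queried infrequently that emits a goal (e.g. a footstep or navigation target), and a low-level policy that turns the current state and goal into motor actions every step. The interfaces, names, and period below are assumptions for illustration, not the thesis's code.

```python
HIGH_LEVEL_PERIOD = 30  # low-level steps per high-level decision (assumed)

def run_episode(env, high_policy, low_policy, max_steps=1000):
    """Two-timescale control loop: coarse goals from the high level,
    fine-grained motor actions from the low level."""
    state = env.reset()
    goal = None
    total_reward = 0.0
    for t in range(max_steps):
        if t % HIGH_LEVEL_PERIOD == 0:
            goal = high_policy(state)      # long-horizon: where to go
        action = low_policy(state, goal)   # short-horizon: how to move
        state, reward, done = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward
```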
author | Peng, Xue Bin
title | Developing locomotion skills with deep reinforcement learning
publisher | University of British Columbia
publishDate | 2017
url | http://hdl.handle.net/2429/61370