Developing locomotion skills with deep reinforcement learning

While physics-based models for passive phenomena such as cloth and fluids have been widely adopted in computer animation, physics-based character simulation remains a challenging problem. One of the major hurdles for character simulation is that of control: the modeling of a character's behaviour in response to its goals and environment.

Bibliographic Details
Main Author: Peng, Xue Bin
Language: English
Published: University of British Columbia, 2017
Online Access: http://hdl.handle.net/2429/61370
id ndltd-UBC-oai-circle.library.ubc.ca-2429-61370
record_format oai_dc
spelling ndltd-UBC-oai-circle.library.ubc.ca-2429-61370 2018-01-05T17:29:43Z Developing locomotion skills with deep reinforcement learning Peng, Xue Bin While physics-based models for passive phenomena such as cloth and fluids have been widely adopted in computer animation, physics-based character simulation remains a challenging problem. One of the major hurdles for character simulation is that of control: the modeling of a character's behaviour in response to its goals and environment. This challenge is further compounded by the high-dimensional and complex dynamics that often arise from these systems. A popular approach to mitigating these challenges is to build reduced models that capture important properties for a particular task. These models often leverage significant human insight, and may nonetheless overlook important information. In this thesis, we explore the application of deep reinforcement learning (DeepRL) to develop control policies that operate directly on high-dimensional, low-level representations, thereby reducing the need for manual feature engineering and enabling characters to perform more challenging tasks in complex environments. We start by presenting a DeepRL framework for developing policies that allow characters to agilely traverse irregular terrain. The policies are represented using a mixture-of-experts model, which selects from a small collection of parameterized controllers. Our method is demonstrated on planar characters of varying morphologies and different classes of terrain. Through the learning process, the networks develop appropriate strategies for traveling across various irregular environments without requiring extensive feature engineering. Next, we explore the effects of different action parameterizations on the performance of RL policies. We compare policies trained using actions at varying levels of abstraction: torques, target velocities, target angles, and muscle activations. Performance is evaluated using a motion-imitation benchmark. For our particular task, the choice of higher-level actions that incorporate local feedback, such as target angles, leads to significant improvements in performance and learning speed. Finally, we describe a hierarchical reinforcement learning framework for controlling the motion of a simulated 3D biped. By training each level of the hierarchy to operate at different spatial and temporal scales, the character is able to perform a variety of locomotion tasks that require a balance between short-term and long-term planning. Some of the tasks include soccer dribbling, path following, and navigation across dynamic obstacles. === Science, Faculty of === Computer Science, Department of === Graduate 2017-04-24T21:43:07Z 2017-04-24T21:43:07Z 2017 2017-09 Text Thesis/Dissertation http://hdl.handle.net/2429/61370 eng Attribution-NonCommercial-NoDerivatives 4.0 International http://creativecommons.org/licenses/by-nc-nd/4.0/ University of British Columbia
collection NDLTD
language English
sources NDLTD
description While physics-based models for passive phenomena such as cloth and fluids have been widely adopted in computer animation, physics-based character simulation remains a challenging problem. One of the major hurdles for character simulation is that of control: the modeling of a character's behaviour in response to its goals and environment. This challenge is further compounded by the high-dimensional and complex dynamics that often arise from these systems. A popular approach to mitigating these challenges is to build reduced models that capture important properties for a particular task. These models often leverage significant human insight, and may nonetheless overlook important information. In this thesis, we explore the application of deep reinforcement learning (DeepRL) to develop control policies that operate directly on high-dimensional, low-level representations, thereby reducing the need for manual feature engineering and enabling characters to perform more challenging tasks in complex environments. We start by presenting a DeepRL framework for developing policies that allow characters to agilely traverse irregular terrain. The policies are represented using a mixture-of-experts model, which selects from a small collection of parameterized controllers. Our method is demonstrated on planar characters of varying morphologies and different classes of terrain. Through the learning process, the networks develop appropriate strategies for traveling across various irregular environments without requiring extensive feature engineering. Next, we explore the effects of different action parameterizations on the performance of RL policies. We compare policies trained using actions at varying levels of abstraction: torques, target velocities, target angles, and muscle activations. Performance is evaluated using a motion-imitation benchmark. For our particular task, the choice of higher-level actions that incorporate local feedback, such as target angles, leads to significant improvements in performance and learning speed. Finally, we describe a hierarchical reinforcement learning framework for controlling the motion of a simulated 3D biped. By training each level of the hierarchy to operate at different spatial and temporal scales, the character is able to perform a variety of locomotion tasks that require a balance between short-term and long-term planning. Some of the tasks include soccer dribbling, path following, and navigation across dynamic obstacles. === Science, Faculty of === Computer Science, Department of === Graduate
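To make the mixture-of-experts idea in the abstract concrete, here is a minimal sketch of a policy that scores a small collection of parameterized controllers and selects one per decision. The feature dimensions, parameter counts, and random linear "networks" are illustrative assumptions, not the thesis's architecture.

# A minimal sketch (not the thesis code) of a mixture-of-experts policy
# that selects among a few parameterized controllers.
import numpy as np

NUM_EXPERTS = 3   # small collection of parameterized controllers (assumed)
STATE_DIM = 8     # character + terrain features (assumed)
PARAM_DIM = 4     # parameters each controller exposes (assumed)

rng = np.random.default_rng(0)
# Stand-ins for learned networks: one gating score and one parameter
# head per expert, here just random linear maps for illustration.
W_gate = rng.standard_normal((NUM_EXPERTS, STATE_DIM))
W_params = rng.standard_normal((NUM_EXPERTS, PARAM_DIM, STATE_DIM))

def select_action(state):
    """Score every expert on the current state, pick the best one,
    and return that expert's controller parameters as the action."""
    scores = W_gate @ state        # one score per expert
    k = int(np.argmax(scores))     # choose the highest-scoring expert
    params = W_params[k] @ state   # that expert's controller parameters
    return k, params

state = rng.standard_normal(STATE_DIM)
expert, controller_params = select_action(state)
print(f"expert {expert} -> params {controller_params}")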
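The action-parameterization comparison hinges on how the policy's output becomes joint torque. The sketch below contrasts two of the schemes named in the abstract: direct torques versus target angles tracked by a PD controller, where the PD loop supplies the local feedback credited with better performance. The gains and toy values are assumptions, not values from the thesis.

# A minimal sketch of two action parameterizations for a simulated joint.
import numpy as np

KP, KD = 300.0, 30.0   # assumed PD gains, not the thesis's values

def torque_from_torque_action(action, q, q_dot):
    # Low-level parameterization: the policy output IS the torque.
    return action

def torque_from_target_angle_action(action, q, q_dot):
    # Higher-level parameterization: the policy outputs target joint
    # angles; a PD controller turns the tracking error into torque,
    # providing local feedback between policy decisions.
    q_target = action
    return KP * (q_target - q) - KD * q_dot

q = np.array([0.1, -0.2])     # current joint angles (toy values)
q_dot = np.array([0.0, 0.5])  # current joint velocities (toy values)
a = np.array([0.3, -0.1])     # policy output under either scheme
print(torque_from_torque_action(a, q, q_dot))
print(torque_from_target_angle_action(a, q, q_dot))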
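The hierarchical framework can be pictured as two nested control loops: a high-level policy that replans at a coarse timestep and a low-level policy that acts at every simulation step. Below is a minimal sketch of that loop structure; the replanning period, dimensions, and placeholder policies are assumptions rather than the thesis's values.

# A minimal sketch of a two-level hierarchical control loop.
import numpy as np

HIGH_LEVEL_PERIOD = 15   # low-level steps per high-level decision (assumed)

rng = np.random.default_rng(0)

def high_level_policy(state):
    # e.g., pick a heading or footstep goal for the next window
    return rng.standard_normal(2)

def low_level_policy(state, goal):
    # e.g., produce joint-level actions that pursue the current goal
    return rng.standard_normal(4)

state = rng.standard_normal(6)
goal = None
for t in range(45):
    if t % HIGH_LEVEL_PERIOD == 0:
        goal = high_level_policy(state)     # long-horizon planning
    action = low_level_policy(state, goal)  # short-horizon execution
    # state = simulate(state, action)       # physics step (omitted here)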
author Peng, Xue Bin
spellingShingle Peng, Xue Bin
Developing locomotion skills with deep reinforcement learning
author_facet Peng, Xue Bin
author_sort Peng, Xue Bin
title Developing locomotion skills with deep reinforcement learning
title_short Developing locomotion skills with deep reinforcement learning
title_full Developing locomotion skills with deep reinforcement learning
title_fullStr Developing locomotion skills with deep reinforcement learning
title_full_unstemmed Developing locomotion skills with deep reinforcement learning
title_sort developing locomotion skills with deep reinforcement learning
publisher University of British Columbia
publishDate 2017
url http://hdl.handle.net/2429/61370
work_keys_str_mv AT pengxuebin developinglocomotionskillswithdeepreinforcementlearning
_version_ 1718585755561885696