Developing locomotion skills with deep reinforcement learning

While physics-based models for passive phenomena such as cloth and fluids have been widely adopted in computer animation, physics-based character simulation remains a challenging problem. One of the major hurdles for character simulation is that of control: the modeling of a character's behaviour in response to its goals and environment.

Bibliographic Details
Main Author: Peng, Xue Bin
Language: English
Published: University of British Columbia, 2017
Online Access: http://hdl.handle.net/2429/61370
id ndltd-UBC-oai-circle.library.ubc.ca-2429-61370
record_format oai_dc
spelling ndltd-UBC-oai-circle.library.ubc.ca-2429-61370 2018-01-05T17:29:43Z Developing locomotion skills with deep reinforcement learning Peng, Xue Bin While physics-based models for passive phenomena such as cloth and fluids have been widely adopted in computer animation, physics-based character simulation remains a challenging problem. One of the major hurdles for character simulation is that of control: the modeling of a character's behaviour in response to its goals and environment. This challenge is further compounded by the high-dimensional and complex dynamics that often arise from these systems. A popular approach to mitigating these challenges is to build reduced models that capture important properties for a particular task. These models often leverage significant human insight, and may nonetheless overlook important information. In this thesis, we explore the application of deep reinforcement learning (DeepRL) to develop control policies that operate directly on high-dimensional, low-level representations, thereby reducing the need for manual feature engineering and enabling characters to perform more challenging tasks in complex environments. We start by presenting a DeepRL framework for developing policies that allow characters to agilely traverse irregular terrain. The policies are represented using a mixture-of-experts model, which selects from a small collection of parameterized controllers. Our method is demonstrated on planar characters of varying morphologies and different classes of terrain. Through the learning process, the networks develop appropriate strategies for traveling across various irregular environments without requiring extensive feature engineering. Next, we explore the effects of different action parameterizations on the performance of RL policies. We compare policies trained using actions at varying levels of abstraction: torques, target velocities, target angles, and muscle activations. Performance is evaluated using a motion-imitation benchmark. For our particular task, the choice of higher-level actions that incorporate local feedback, such as target angles, leads to significant improvements in performance and learning speed. Finally, we describe a hierarchical reinforcement learning framework for controlling the motion of a simulated 3D biped. By training each level of the hierarchy to operate at different spatial and temporal scales, the character is able to perform a variety of locomotion tasks that require a balance between short-term and long-term planning. Some of the tasks include soccer dribbling, path following, and navigation across dynamic obstacles. === Science, Faculty of === Computer Science, Department of === Graduate 2017-04-24T21:43:07Z 2017-04-24T21:43:07Z 2017 2017-09 Text Thesis/Dissertation http://hdl.handle.net/2429/61370 eng Attribution-NonCommercial-NoDerivatives 4.0 International http://creativecommons.org/licenses/by-nc-nd/4.0/ University of British Columbia
collection NDLTD
language English
sources NDLTD
description While physics-based models for passive phenomena such as cloth and fluids have been widely adopted in computer animation, physics-based character simulation remains a challenging problem. One of the major hurdles for character simulation is that of control: the modeling of a character's behaviour in response to its goals and environment. This challenge is further compounded by the high-dimensional and complex dynamics that often arise from these systems. A popular approach to mitigating these challenges is to build reduced models that capture important properties for a particular task. These models often leverage significant human insight, and may nonetheless overlook important information. In this thesis, we explore the application of deep reinforcement learning (DeepRL) to develop control policies that operate directly on high-dimensional, low-level representations, thereby reducing the need for manual feature engineering and enabling characters to perform more challenging tasks in complex environments. We start by presenting a DeepRL framework for developing policies that allow characters to agilely traverse irregular terrain. The policies are represented using a mixture-of-experts model, which selects from a small collection of parameterized controllers. Our method is demonstrated on planar characters of varying morphologies and different classes of terrain. Through the learning process, the networks develop appropriate strategies for traveling across various irregular environments without requiring extensive feature engineering. Next, we explore the effects of different action parameterizations on the performance of RL policies. We compare policies trained using actions at varying levels of abstraction: torques, target velocities, target angles, and muscle activations. Performance is evaluated using a motion-imitation benchmark. For our particular task, the choice of higher-level actions that incorporate local feedback, such as target angles, leads to significant improvements in performance and learning speed. Finally, we describe a hierarchical reinforcement learning framework for controlling the motion of a simulated 3D biped. By training each level of the hierarchy to operate at different spatial and temporal scales, the character is able to perform a variety of locomotion tasks that require a balance between short-term and long-term planning. Some of the tasks include soccer dribbling, path following, and navigation across dynamic obstacles. === Science, Faculty of === Computer Science, Department of === Graduate
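To make the mixture-of-experts idea in the abstract concrete, here is a minimal sketch of a policy that scores a small collection of parameterized controllers and selects one per decision. The feature dimensions, parameter counts, and random linear "networks" are illustrative assumptions, not the thesis's architecture.

# A minimal sketch (not the thesis code) of a mixture-of-experts policy
# that selects among a few parameterized controllers.
import numpy as np

NUM_EXPERTS = 3   # small collection of parameterized controllers (assumed)
STATE_DIM = 8     # character + terrain features (assumed)
PARAM_DIM = 4     # parameters each controller exposes (assumed)

rng = np.random.default_rng(0)
# Stand-ins for learned networks: one gating score and one parameter
# head per expert, here just random linear maps for illustration.
W_gate = rng.standard_normal((NUM_EXPERTS, STATE_DIM))
W_params = rng.standard_normal((NUM_EXPERTS, PARAM_DIM, STATE_DIM))

def select_action(state):
    """Score every expert on the current state, pick the best one,
    and return that expert's controller parameters as the action."""
    scores = W_gate @ state        # one score per expert
    k = int(np.argmax(scores))     # choose the highest-scoring expert
    params = W_params[k] @ state   # that expert's controller parameters
    return k, params

state = rng.standard_normal(STATE_DIM)
expert, controller_params = select_action(state)
print(f"expert {expert} -> params {controller_params}")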
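The action-parameterization comparison hinges on how the policy's output becomes joint torque. The sketch below contrasts two of the schemes named in the abstract: direct torques versus target angles tracked by a PD controller, where the PD loop supplies the local feedback credited with better performance. The gains and toy values are assumptions, not values from the thesis.

# A minimal sketch of two action parameterizations for a simulated joint.
import numpy as np

KP, KD = 300.0, 30.0   # assumed PD gains, not the thesis's values

def torque_from_torque_action(action, q, q_dot):
    # Low-level parameterization: the policy output IS the torque.
    return action

def torque_from_target_angle_action(action, q, q_dot):
    # Higher-level parameterization: the policy outputs target joint
    # angles; a PD controller turns the tracking error into torque,
    # providing local feedback between policy decisions.
    q_target = action
    return KP * (q_target - q) - KD * q_dot

q = np.array([0.1, -0.2])     # current joint angles (toy values)
q_dot = np.array([0.0, 0.5])  # current joint velocities (toy values)
a = np.array([0.3, -0.1])     # policy output under either scheme
print(torque_from_torque_action(a, q, q_dot))
print(torque_from_target_angle_action(a, q, q_dot))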
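The hierarchical framework can be pictured as two nested control loops: a high-level policy that replans at a coarse timestep and a low-level policy that acts at every simulation step. Below is a minimal sketch of that loop structure; the replanning period, dimensions, and placeholder policies are assumptions rather than the thesis's values.

# A minimal sketch of a two-level hierarchical control loop.
import numpy as np

HIGH_LEVEL_PERIOD = 15   # low-level steps per high-level decision (assumed)

rng = np.random.default_rng(0)

def high_level_policy(state):
    # e.g., pick a heading or footstep goal for the next window
    return rng.standard_normal(2)

def low_level_policy(state, goal):
    # e.g., produce joint-level actions that pursue the current goal
    return rng.standard_normal(4)

state = rng.standard_normal(6)
goal = None
for t in range(45):
    if t % HIGH_LEVEL_PERIOD == 0:
        goal = high_level_policy(state)     # long-horizon planning
    action = low_level_policy(state, goal)  # short-horizon execution
    # state = simulate(state, action)       # physics step (omitted here)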
author Peng, Xue Bin
spellingShingle Peng, Xue Bin
Developing locomotion skills with deep reinforcement learning
author_facet Peng, Xue Bin
author_sort Peng, Xue Bin
title Developing locomotion skills with deep reinforcement learning
title_short Developing locomotion skills with deep reinforcement learning
title_full Developing locomotion skills with deep reinforcement learning
title_fullStr Developing locomotion skills with deep reinforcement learning
title_full_unstemmed Developing locomotion skills with deep reinforcement learning
title_sort developing locomotion skills with deep reinforcement learning
publisher University of British Columbia
publishDate 2017
url http://hdl.handle.net/2429/61370
work_keys_str_mv AT pengxuebin developinglocomotionskillswithdeepreinforcementlearning
_version_ 1718585755561885696