Risk-Aware Model-Based Control

Model-Based Reinforcement Learning (MBRL) algorithms have been shown to have an advantage in data efficiency, but they are often outperformed by state-of-the-art model-free methods, especially on high-dimensional and complex problems. In this work, a novel MBRL method is proposed, cal...


Bibliographic Details
Main Authors: Chen Yu, Andre Rosendo
Format: Article
Language: English
Published: Frontiers Media S.A. 2021-03-01
Series: Frontiers in Robotics and AI
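Subjects: machine learning; reinforcement learning; dynamics model; risk awareness; conditional value at risk; data efficiency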
Online Access: https://www.frontiersin.org/articles/10.3389/frobt.2021.617839/full
id doaj-5b1863aaa0c640829a4dd8bb055c93ff
record_format Article
spelling doaj-5b1863aaa0c640829a4dd8bb055c93ff | 2021-03-11T04:41:15Z | eng | Frontiers Media S.A. | Frontiers in Robotics and AI | 2296-9144 | 2021-03-01 | 8 | 10.3389/frobt.2021.617839 | 617839 | Risk-Aware Model-Based Control | Chen Yu | Andre Rosendo | https://www.frontiersin.org/articles/10.3389/frobt.2021.617839/full | machine learning | reinforcement learning | dynamics model | risk awareness | conditional value at risk | data efficiency
collection DOAJ
language English
format Article
sources DOAJ
author Chen Yu
Andre Rosendo
spellingShingle Chen Yu
Andre Rosendo
Risk-Aware Model-Based Control
Frontiers in Robotics and AI
machine learning
reinforcement learning
dynamics model
risk awareness
conditional value at risk
data efficiency
author_facet Chen Yu
Andre Rosendo
author_sort Chen Yu
title Risk-Aware Model-Based Control
title_short Risk-Aware Model-Based Control
title_full Risk-Aware Model-Based Control
title_fullStr Risk-Aware Model-Based Control
title_full_unstemmed Risk-Aware Model-Based Control
title_sort risk-aware model-based control
publisher Frontiers Media S.A.
series Frontiers in Robotics and AI
issn 2296-9144
publishDate 2021-03-01
description Model-Based Reinforcement Learning (MBRL) algorithms have been shown to have an advantage in data efficiency, but they are often outperformed by state-of-the-art model-free methods, especially on high-dimensional and complex problems. In this work, a novel MBRL method is proposed, called Risk-Aware Model-Based Control (RAMCO). It combines uncertainty-aware deep dynamics models with the risk assessment technique Conditional Value at Risk (CVaR). This mechanism is appropriate for real-world applications since it takes epistemic risk into consideration. In addition, we use a model-free solver to produce warm-up training data; this setting improves performance in low-dimensional environments and compensates for the inherent weakness of MBRL in high-dimensional scenarios. In comparison with other state-of-the-art reinforcement learning algorithms, we show that it produces superior results on a walking robot model. We also evaluate the method in an Eidos environment, a novel experimental setup that uses multi-dimensional, randomly initialized deep neural networks to measure the performance of any reinforcement learning algorithm, and the advantages of RAMCO are highlighted.
topic machine learning
reinforcement learning
dynamics model
risk awareness
conditional value at risk
data efficiency
url https://www.frontiersin.org/articles/10.3389/frobt.2021.617839/full
work_keys_str_mv AT chenyu riskawaremodelbasedcontrol
AT andrerosendo riskawaremodelbasedcontrol
_version_ 1724226016019218432