Risk-Aware Model-Based Control
Model-Based Reinforcement Learning (MBRL) algorithms have been shown to have an advantage in data efficiency, but they are often overshadowed by state-of-the-art model-free methods in performance, especially on high-dimensional and complex problems. In this work, we propose a novel MBRL method called Risk-Aware Model-Based Control (RAMCO), which combines uncertainty-aware deep dynamics models with the risk-assessment technique Conditional Value at Risk (CVaR). This mechanism is well suited to real-world applications because it takes epistemic risk into consideration. In addition, we use a model-free solver to produce warm-up training data; this setting improves performance in low-dimensional environments and compensates for MBRL's inherent weakness in high-dimensional scenarios. In comparison with other state-of-the-art reinforcement learning algorithms, RAMCO produces superior results on a walking-robot model. We also evaluate the method in an Eidos environment, a novel experimental setup that uses multi-dimensional, randomly initialized deep neural networks to measure the performance of any reinforcement learning algorithm, and the advantages of RAMCO are highlighted.
Main Authors: Chen Yu, Andre Rosendo
Format: Article
Language: English
Published: Frontiers Media S.A., 2021-03-01
Series: Frontiers in Robotics and AI
ISSN: 2296-9144
DOI: 10.3389/frobt.2021.617839
Subjects: machine learning; reinforcement learning; dynamics model; risk awareness; conditional value at risk; data efficiency
Online Access: https://www.frontiersin.org/articles/10.3389/frobt.2021.617839/full
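The abstract describes coupling an ensemble of uncertainty-aware dynamics models with the Conditional Value at Risk (CVaR) measure. The sketch below is only an illustration of that idea, not the authors' implementation: it scores a candidate action sequence by the CVaR of returns predicted by each ensemble member. The names `rollout_fn`, `n_models`, and the `alpha` level are hypothetical placeholders.

```python
# Minimal sketch (assumed, not from the paper): CVaR over returns predicted
# by an ensemble of learned dynamics models.
import numpy as np

def cvar(returns, alpha=0.05):
    """Conditional Value at Risk: mean of the worst alpha-fraction of returns."""
    returns = np.sort(np.asarray(returns))          # ascending, so worst returns come first
    k = max(1, int(np.ceil(alpha * len(returns))))  # size of the lower tail
    return returns[:k].mean()

def score_action_sequence(rollout_fn, action_seq, n_models=5, alpha=0.05):
    """Score an action sequence by the CVaR of ensemble-predicted returns.

    rollout_fn(action_seq, model_idx) is assumed to return the total return
    predicted by one member of the dynamics-model ensemble.
    """
    predicted_returns = [rollout_fn(action_seq, model_idx=m) for m in range(n_models)]
    return cvar(predicted_returns, alpha)
```

A risk-aware controller in this spirit would prefer action sequences with a higher CVaR score, i.e. those whose worst-case predicted outcomes under model uncertainty are least bad.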