Task Offloading and Resource Allocation Using Deep Reinforcement Learning
Rapid urbanization poses huge challenges to people's daily lives, such as traffic congestion, environmental pollution, and public safety. Mobile Internet of things (MIoT) applications serving smart cities bring the promise of innovative and enhanced public services such as air pollution monitor...
Main Author: | Zhang, Kaiyi |
---|---|
Other Authors: | Samaan, Nancy A. |
Format: | Others |
Language: | en |
Published: | Université d'Ottawa / University of Ottawa, 2020 |
Subjects: | Offloading; Deep reinforcement learning |
Online Access: | http://hdl.handle.net/10393/41525 http://dx.doi.org/10.20381/ruor-25749 |
id |
ndltd-uottawa.ca-oai-ruor.uottawa.ca-10393-41525 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-uottawa.ca-oai-ruor.uottawa.ca-10393-41525 2020-12-02T05:26:41Z Task Offloading and Resource Allocation Using Deep Reinforcement Learning Zhang, Kaiyi Samaan, Nancy A. Offloading Deep reinforcement learning 2020-12-01T13:43:47Z 2020-12-01T13:43:47Z 2020-12-01 Thesis http://hdl.handle.net/10393/41525 http://dx.doi.org/10.20381/ruor-25749 en application/pdf Université d'Ottawa / University of Ottawa |
collection |
NDLTD |
language |
en |
format |
Others |
sources |
NDLTD |
topic |
Offloading Deep reinforcement learning |
spellingShingle |
Offloading Deep reinforcement learning Zhang, Kaiyi Task Offloading and Resource Allocation Using Deep Reinforcement Learning |
description |
Rapid urbanization poses huge challenges to people's daily lives, such as traffic congestion, environmental pollution, and public safety. Mobile Internet of things (MIoT) applications serving smart cities bring the promise of innovative and enhanced public services such as air pollution monitoring, enhanced road safety, and city resource metering and management. These applications rely on a number of energy-constrained MIoT units (MUs) (e.g., robots and drones) to continuously sense, capture, and process data and images from their environments to produce immediate adaptive actions (e.g., triggering alarms, controlling machinery, and communicating with citizens). In this thesis, we consider a scenario where a battery-constrained MU executes a number of time-sensitive data processing tasks whose arrival times and sizes are stochastic in nature. These tasks can be executed locally on the device, offloaded to one of the nearby edge servers, or offloaded to a cloud data center within a mobile edge computing (MEC) infrastructure. We first formulate the problem of making optimal offloading decisions that minimize the cost of current and future tasks as a constrained Markov decision process (CMDP) that accounts for the constraints of the MU battery and the limited resources reserved on the MEC infrastructure by the application providers. Then, we relax the CMDP problem into a regular Markov decision process (MDP) using Lagrangian primal-dual optimization. We then develop an advantage actor-critic (A2C) algorithm, a model-free deep reinforcement learning (DRL) method, to train the MU to solve the relaxed problem. The training of the MU can be carried out once to learn optimal offloading policies that are repeatedly employed as long as there are no large changes in the MU environment. Simulation results are presented to show that the proposed algorithm can achieve a performance improvement over offloading decision schemes that aim at optimizing instantaneous costs. |
author2 |
Samaan, Nancy A. |
author_facet |
Samaan, Nancy A. Zhang, Kaiyi |
author |
Zhang, Kaiyi |
author_sort |
Zhang, Kaiyi |
title |
Task Offloading and Resource Allocation Using Deep Reinforcement Learning |
title_sort |
task offloading and resource allocation using deep reinforcement learning |
publisher |
Université d'Ottawa / University of Ottawa |
publishDate |
2020 |
url |
http://hdl.handle.net/10393/41525 http://dx.doi.org/10.20381/ruor-25749 |
work_keys_str_mv |
AT zhangkaiyi taskoffloadingandresourceallocationusingdeepreinforcementlearning |
_version_ |
1719363368323645440 |
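
The description above mentions relaxing the CMDP into a regular MDP through Lagrangian primal-dual optimization. The record does not give the thesis's actual formulation, so the following is only a minimal sketch of the generic technique, assuming a per-step cost, one per-step constraint cost per budget (for example battery energy and reserved MEC resources), and non-negative multipliers updated by projected gradient ascent. All function names, parameters, and numbers are hypothetical and are not taken from the thesis.

```python
from typing import List, Sequence


def relaxed_cost(cost: float,
                 constraint_costs: Sequence[float],
                 budgets: Sequence[float],
                 multipliers: Sequence[float]) -> float:
    """Per-step cost of the relaxed (unconstrained) MDP.

    Computes c + sum_i lambda_i * (d_i - b_i), where d_i is the observed
    constraint cost and b_i its budget. Names are illustrative assumptions.
    """
    penalty = sum(lam * (d - b)
                  for lam, d, b in zip(multipliers, constraint_costs, budgets))
    return cost + penalty


def dual_ascent_step(multipliers: Sequence[float],
                     avg_constraint_costs: Sequence[float],
                     budgets: Sequence[float],
                     step_size: float) -> List[float]:
    """One projected gradient-ascent update of the Lagrange multipliers.

    A multiplier grows while its constraint is violated on average and
    shrinks otherwise; max(0, .) keeps it non-negative.
    """
    return [max(0.0, lam + step_size * (d - b))
            for lam, d, b in zip(multipliers, avg_constraint_costs, budgets)]


if __name__ == "__main__":
    # Hypothetical budgets: energy per step and reserved MEC resource units.
    budgets = [1.0, 2.0]
    multipliers = [2.0, 0.5]
    observed = [1.5, 1.0]  # first constraint violated, second satisfied

    print(relaxed_cost(3.0, observed, budgets, multipliers))
    print(dual_ascent_step(multipliers, observed, budgets, step_size=0.1))
```

In a primal-dual loop of this kind, a policy learner (A2C in the description, though any method that minimizes the relaxed cost would fit this sketch) alternates with the multiplier update until the constraints hold on average.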