Task Offloading and Resource Allocation Using Deep Reinforcement Learning

Rapid urbanization poses huge challenges to people's daily lives, such as traffic congestion, environmental pollution, and public safety. Mobile Internet of things (MIoT) applications serving smart cities bring the promise of innovative and enhanced public services such as air pollution monitor...


Bibliographic Details
Main Author: Zhang, Kaiyi
Other Authors: Samaan, Nancy A.
Format: Others
Language: en
Published: Université d'Ottawa / University of Ottawa 2020
Subjects: Offloading; Deep reinforcement learning
Online Access: http://hdl.handle.net/10393/41525
http://dx.doi.org/10.20381/ruor-25749
id ndltd-uottawa.ca-oai-ruor.uottawa.ca-10393-41525
record_format oai_dc
spelling ndltd-uottawa.ca-oai-ruor.uottawa.ca-10393-41525 2020-12-02T05:26:41Z Task Offloading and Resource Allocation Using Deep Reinforcement Learning Zhang, Kaiyi Samaan, Nancy A. Offloading Deep reinforcement learning 2020-12-01T13:43:47Z 2020-12-01T13:43:47Z 2020-12-01 Thesis http://hdl.handle.net/10393/41525 http://dx.doi.org/10.20381/ruor-25749 en application/pdf Université d'Ottawa / University of Ottawa
collection NDLTD
language en
format Others
sources NDLTD
topic Offloading
Deep reinforcement learning
description Rapid urbanization poses major challenges to people's daily lives, such as traffic congestion, environmental pollution, and public safety. Mobile Internet of Things (MIoT) applications serving smart cities promise innovative and enhanced public services such as air pollution monitoring, enhanced road safety, and city resource metering and management. These applications rely on a number of energy-constrained MIoT units (MUs) (e.g., robots and drones) to continuously sense, capture, and process data and images from their environments and to produce immediate adaptive actions (e.g., triggering alarms, controlling machinery, and communicating with citizens). In this thesis, we consider a scenario where a battery-constrained MU executes a number of time-sensitive data processing tasks whose arrival times and sizes are stochastic. These tasks can be executed locally on the device or offloaded, either to one of the nearby edge servers or to a cloud data center, within a mobile edge computing (MEC) infrastructure. We first formulate the problem of making optimal offloading decisions that minimize the cost of current and future tasks as a constrained Markov decision process (CMDP) that accounts for the constraints of the MU battery and the limited resources reserved on the MEC infrastructure by the application providers. We then relax the CMDP into a regular Markov decision process (MDP) using Lagrangian primal-dual optimization. Finally, we develop an advantage actor-critic (A2C) algorithm, a model-free deep reinforcement learning (DRL) method, to train the MU to solve the relaxed problem. The MU needs to be trained only once to learn optimal offloading policies, which can then be reused as long as there are no large changes in the MU environment. Simulation results show that the proposed algorithm improves performance over offloading schemes that optimize only instantaneous costs.
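The relaxation step described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the names `DualVariable`, `relaxed_cost`, and `one_step_advantage`, the single energy constraint, the per-step budget, and the step size are all assumptions made for illustration. The sketch folds one constraint into the per-step cost with a Lagrange multiplier, updates that multiplier by projected subgradient ascent, and computes the one-step advantage that an actor-critic learner would use.

```python
from dataclasses import dataclass


@dataclass
class DualVariable:
    """Lagrange multiplier for a single constraint (hypothetical helper)."""
    value: float = 0.0
    step_size: float = 0.1  # assumed dual learning rate

    def update(self, avg_constraint_cost: float, budget: float) -> float:
        # Projected subgradient ascent: the multiplier grows while the
        # constraint is violated, shrinks otherwise, and never drops below 0.
        self.value = max(
            0.0, self.value + self.step_size * (avg_constraint_cost - budget)
        )
        return self.value


def relaxed_cost(task_cost: float, energy_used: float,
                 energy_budget: float, lam: float) -> float:
    """Per-step cost of the relaxed (unconstrained) MDP.

    Assumes one energy constraint; the constraint violation is added to
    the task cost, weighted by the multiplier `lam`.
    """
    return task_cost + lam * (energy_used - energy_budget)


def one_step_advantage(reward: float, value: float,
                       next_value: float, gamma: float) -> float:
    """One-step advantage estimate: r + gamma * V(s') - V(s)."""
    return reward + gamma * next_value - value


if __name__ == "__main__":
    dual = DualVariable(step_size=0.5)
    lam = dual.update(avg_constraint_cost=3.0, budget=2.0)
    cost = relaxed_cost(task_cost=1.0, energy_used=3.0,
                        energy_budget=2.0, lam=lam)
    # The learner maximizes reward, so the relaxed cost enters negated.
    adv = one_step_advantage(reward=-cost, value=0.0,
                             next_value=0.0, gamma=0.9)
    print(lam, cost, adv)
```

In a primal-dual scheme of this kind, the policy (primal) update and the multiplier (dual) update alternate, so the penalty on constraint violations adapts during training instead of being hand-tuned.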
author2 Samaan, Nancy A.
author_facet Samaan, Nancy A.
Zhang, Kaiyi
author Zhang, Kaiyi
author_sort Zhang, Kaiyi
title Task Offloading and Resource Allocation Using Deep Reinforcement Learning
publisher Université d'Ottawa / University of Ottawa
publishDate 2020
url http://hdl.handle.net/10393/41525
http://dx.doi.org/10.20381/ruor-25749