D2D Resource Allocation Based on Reinforcement Learning with Power Control

Master's thesis === National Ilan University === Master's Program, Department of Electronic Engineering === 107 === Device-to-device (D2D) communication is defined as direct communication between two D2D user equipment devices (DUEs) without traversing the base station of an LTE network. With the underlay mode of resource reuse, DUEs are allocated resource blocks (RBs...


Bibliographic Details
Main Authors: LIN, WEN-JUN, 林玟均
Other Authors: WANG, HWANG-CHENG
Format: Others
Language: zh-TW
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/cyxbrm
id ndltd-TW-107NIU00428007
record_format oai_dc
spelling ndltd-TW-107NIU00428007 2019-08-31T03:47:41Z http://ndltd.ncl.edu.tw/handle/cyxbrm D2D Resource Allocation Based on Reinforcement Learning with Power Control D2D資源分配—以具有功率控制之強化學習為基礎 LIN, WEN-JUN 林玟均 WANG, HWANG-CHENG 王煌城 2019 thesis 35 zh-TW
collection NDLTD
sources NDLTD
description Master's thesis === National Ilan University === Master's Program, Department of Electronic Engineering === 107 === Device-to-device (D2D) communication is defined as direct communication between two D2D user equipment devices (DUEs) without traversing the base station of an LTE network. In the underlay mode of resource reuse, DUEs are allocated resource blocks (RBs) that are also used by the cellular user equipment (CUE) within the coverage area of the same base station. Reusing the spectrum in this way improves system throughput. One class of reinforcement learning (RL) methods for allocating RBs is the multi-armed bandit (MAB) algorithm, which has variants such as epsilon-first, epsilon-greedy, and upper confidence bound (UCB). Because the transmission power of a DUE determines the interference it causes to the CUE and to other DUEs using the same RBs, it also affects system throughput. In this thesis, we study resource allocation policies based on different MAB variants while taking power control of the DUEs into account.
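The abstract names epsilon-greedy and upper confidence bound as MAB variants for choosing RBs. The following is only a minimal sketch of those two selection rules, not the thesis's method: each arm stands for one candidate RB, the reward is a hypothetical Bernoulli "transmission succeeded" signal, and the names `epsilon_greedy`, `ucb1`, `run`, and `success_prob` are illustrative assumptions. Power control and interference are not modeled here.

```python
import math
import random


def epsilon_greedy(counts, means, epsilon, rng):
    """Explore a uniformly random arm with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(counts))
    return max(range(len(counts)), key=lambda a: means[a])


def ucb1(counts, means, t):
    """Return the arm with the largest UCB1 index; untried arms go first."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(
        range(len(counts)),
        key=lambda a: means[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
    )


def run(policy, success_prob, rounds=5000, epsilon=0.1, seed=0):
    """Simulate one learner choosing among len(success_prob) arms (RBs).

    success_prob is a hypothetical per-RB success probability used only to
    generate rewards for this illustration.
    """
    rng = random.Random(seed)
    k = len(success_prob)
    counts = [0] * k    # number of times each arm was chosen
    means = [0.0] * k   # running average reward of each arm
    for t in range(1, rounds + 1):
        if policy == "epsilon_greedy":
            arm = epsilon_greedy(counts, means, epsilon, rng)
        else:
            arm = ucb1(counts, means, t)
        reward = 1.0 if rng.random() < success_prob[arm] else 0.0
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean
    return counts, means


if __name__ == "__main__":
    for policy in ("epsilon_greedy", "ucb1"):
        counts, means = run(policy, [0.2, 0.5, 0.9])
        print(policy, counts, [round(m, 2) for m in means])
```

Epsilon-first, the third variant mentioned, would differ only in spending a fixed initial share of the rounds on pure exploration before exploiting; a power-aware version would presumably replace the Bernoulli reward with a throughput measure that depends on the chosen transmit power.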