D2D Resource Allocation Based on Reinforcement Learning with Power Control

Bibliographic Details
Main Authors: LIN, WEN-JUN, 林玟均
Other Authors: WANG, HWANG-CHENG
Format: Others
Language: zh-TW
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/cyxbrm
Description
Summary: Master's thesis === National Ilan University === Graduate Institute of Electronic Engineering === Academic Year 107 === Device-to-device (D2D) communication is the direct communication between two user equipment devices (DUEs) without traversing the base station of an LTE network. In the underlay mode of resource reuse, DUEs are allocated resource blocks (RBs) that are also used by cellular user equipment (CUEs) within the coverage area of the same base station; reusing the spectrum in this way improves system throughput. One class of reinforcement learning (RL) methods for allocating RBs is the Multi-Armed Bandit (MAB) algorithm, which has several variants such as Epsilon-first, Epsilon-greedy, and Upper Confidence Bound (UCB). Because the transmission power of a DUE determines the interference it causes to the CUE and to other DUEs sharing the same RBs, it also affects the system throughput. In this thesis, taking power control of DUEs into account, we study resource allocation policies based on different MAB variants.
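To illustrate the MAB framing mentioned in the abstract, the following is a minimal sketch of Epsilon-greedy RB selection, one of the variants named above. It is not the thesis's implementation: the Bernoulli reward model, the function names, and the parameter values are all illustrative assumptions; each RB is treated as one bandit arm whose reward stands in for the throughput a DUE observes after transmitting on it.

```python
import random

def epsilon_greedy(rewards, counts, epsilon, rng):
    """Select an RB (arm) index: with probability epsilon explore a random RB,
    otherwise exploit the RB with the highest empirical mean reward.
    RBs never tried (count 0) are treated as infinitely promising, so each
    arm is sampled at least once before pure exploitation."""
    if rng.random() < epsilon:
        return rng.randrange(len(counts))          # explore
    means = [r / c if c else float("inf") for r, c in zip(rewards, counts)]
    return max(range(len(means)), key=means.__getitem__)  # exploit

def run_bandit(true_means, epsilon=0.1, rounds=2000, seed=0):
    """Simulate one DUE repeatedly choosing among RBs.  The reward for RB i
    is a hypothetical Bernoulli(true_means[i]) 'successful transmission'
    signal; real rewards would come from measured throughput under the
    current power-control setting.  Returns how often each RB was chosen."""
    rng = random.Random(seed)
    n = len(true_means)
    rewards, counts = [0.0] * n, [0] * n
    for _ in range(rounds):
        rb = epsilon_greedy(rewards, counts, epsilon, rng)
        reward = 1.0 if rng.random() < true_means[rb] else 0.0
        rewards[rb] += reward
        counts[rb] += 1
    return counts
```

Usage: `run_bandit([0.2, 0.5, 0.8])` returns the per-RB selection counts; as rounds accumulate, the policy concentrates its choices on the RB with the best observed reward while still exploring occasionally. Epsilon-first differs only in front-loading all exploration, and UCB replaces the random exploration with a confidence-bound bonus.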