Routing Selection With Reinforcement Learning for Energy Harvesting Multi-Hop CRN
This paper considers the routing problem in the communication process of an energy harvesting (EH) multi-hop cognitive radio network (CRN). The transmitter and the relay nodes harvest energy from the environment and use it exclusively for transmitting data. At each relay on the path, a limited data buffer is used to store received data before it is forwarded. We consider a real-world scenario in which each EH node has only local causal knowledge, i.e., at any time it knows only its own EH process, channel state, and currently received data. An EH routing algorithm based on Q learning in reinforcement learning (RL) for multi-hop CRNs (EHR-QL) is proposed. Our goal is to find an optimal routing policy that maximizes throughput and minimizes energy consumption. Through continuous intelligent selection under a partially observable Markov decision process (POMDP), we use the Q learning algorithm with linear function approximation to obtain the optimal path. Compared with basic Q learning routing, EHR-QL is superior over longer distances and higher hop counts: it harvests more energy, consumes less, and leaves predictable residual energy. In particular, the time complexity of EHR-QL is analyzed and its convergence is proved. In the simulation experiments, we first validate EHR-QL on a network of six EH secondary user (EH-SU) nodes. Second, the performance of EHR-QL (i.e., network lifetime, residual energy, and average throughput) is evaluated under different learning rates α and discount factors γ. Finally, the experimental results show that EHR-QL achieves higher throughput, a longer network lifetime, lower latency, and lower energy consumption than basic Q learning routing algorithms.
Main Authors: Xiaoli He, Hong Jiang, Yu Song, Chunlin He, He Xiao
Author Affiliations: Xiaoli He (https://orcid.org/0000-0002-8028-479X), Hong Jiang, Yu Song, and He Xiao: School of Information Engineering, South West University of Science and Technology, Mianyang, China; Chunlin He: School of Computer Science, China West Normal University, Nanchong, China
Format: Article
Language: English
Published: IEEE, 2019-01-01
Series: IEEE Access
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2019.2912996
Subjects: Routing selection; multi-hop CRN; energy harvesting; Q learning; reinforcement learning; MDP
Online Access: https://ieeexplore.ieee.org/document/8697342/
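The abstract describes next-hop selection learned via Q learning with linear function approximation. The sketch below illustrates that core update rule in Python; the six-node topology, the feature vector, the battery model, and the reward shaping are all illustrative assumptions, not the paper's EHR-QL formulation.

```python
"""Minimal sketch of Q learning with linear function approximation for
next-hop routing, in the spirit of the EHR-QL idea described above.
Topology, features, and rewards are assumed for illustration only."""
import random

# Hypothetical 6-node EH-SU topology: node -> candidate next hops.
NEIGHBORS = {0: [1, 2], 1: [2, 3], 2: [3, 4], 3: [5], 4: [5], 5: []}
SINK = 5
N_FEATURES = 3
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2  # learning rate, discount, exploration


def features(next_hop, battery):
    """Assumed local state features for a candidate relay."""
    return [1.0,                               # bias term
            battery[next_hop],                 # residual energy at the relay
            1.0 if next_hop == SINK else 0.0]  # candidate reaches the sink


def q_value(w, x):
    """Linear approximation: Q(s, a) = w . x(s, a)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def choose_next_hop(w, node, battery):
    """Epsilon-greedy selection over the candidate next hops."""
    hops = NEIGHBORS[node]
    if random.random() < EPSILON:
        return random.choice(hops)
    return max(hops, key=lambda h: q_value(w, features(h, battery)))


def td_update(w, x, reward, next_node, battery):
    """Q learning step with linear approximation: w += alpha * delta * x."""
    best_next = 0.0
    if NEIGHBORS[next_node]:
        best_next = max(q_value(w, features(h, battery))
                        for h in NEIGHBORS[next_node])
    delta = reward + GAMMA * best_next - q_value(w, x)
    for i in range(len(w)):
        w[i] += ALPHA * delta * x[i]


def run_episode(w):
    """Route one packet from node 0 to the sink, learning along the way."""
    battery = {n: random.uniform(0.2, 1.0) for n in NEIGHBORS}  # assumed EH state
    node = 0
    while node != SINK:
        hop = choose_next_hop(w, node, battery)
        x = features(hop, battery)
        # Assumed reward: delivery bonus, penalty scaled by energy shortfall.
        reward = 10.0 if hop == SINK else -(1.0 - battery[hop])
        td_update(w, x, reward, hop, battery)
        node = hop


if __name__ == "__main__":
    weights = [0.0] * N_FEATURES
    for _ in range(500):
        run_episode(weights)
    print("learned weights:", weights)
```

With linear approximation, the weight vector generalizes across relays that share features (residual energy, proximity to the sink), which is what lets such a scheme scale to longer paths than a tabular Q learning baseline.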