Routing Selection With Reinforcement Learning for Energy Harvesting Multi-Hop CRN

This paper considers the routing problem in the communication process of an energy harvesting (EH) multi-hop cognitive radio network (CRN). The transmitter and the relays harvest energy from the environment and use it exclusively for transmitting data. At each relay on the path, a limited data buffer is used to store received data before it is forwarded. We consider a realistic scenario in which each EH node has only local causal knowledge, i.e., at any time it knows only its own EH process, channel state, and currently received data. We propose an EH routing algorithm for multi-hop CRNs based on Q learning in reinforcement learning (RL), called EHR-QL. The goal is to find an optimal routing policy that maximizes throughput and minimizes energy consumption. Modeling the problem as a partially observable Markov decision process (POMDP), we use Q learning with linear function approximation to obtain the optimal path through continuous, intelligent selection. Compared with basic Q learning routing, EHR-QL is superior over longer distances and higher hop counts: it harvests more energy, consumes less, and yields predictable residual energy. In addition, the time complexity of EHR-QL is analyzed and its convergence is proved. In the simulation experiments, we first verify EHR-QL using six EH secondary user (EH-SU) nodes; we then evaluate its performance (network lifetime, residual energy, and average throughput) under different learning rates α and discount factors γ. The results show that EHR-QL achieves higher throughput, a longer network lifetime, lower latency, and lower energy consumption than basic Q learning routing algorithms.
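As an illustrative aid (not taken from the paper), the sketch below shows Q learning with linear function approximation applied to next-hop routing, the core technique the abstract describes. Everything concrete here (the six-node topology, the feature vector, the reward values, and the hyperparameters) is an assumption made for the example, not the authors' EHR-QL design.

# Illustrative sketch, not the authors' code: Q learning with linear
# function approximation for next-hop selection in a small multi-hop
# network. The update rule is the standard one,
#   Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)),
# with Q(s,a) = w . phi(s,a); topology, features, and rewards are invented.
import random

ALPHA, GAMMA, EPSILON = 0.1, 0.8, 0.1  # learning rate, discount factor, exploration rate

# Hypothetical 6-node topology (the paper simulates six EH-SU nodes).
NEIGHBORS = {0: [1, 2], 1: [2, 3], 2: [3, 4], 3: [4, 5], 4: [5]}
DEST = 5

def features(node, next_hop):
    """Toy feature vector phi(s, a): a bias term plus normalized node ids."""
    return [1.0, node / 5.0, next_hop / 5.0]

w = [0.0, 0.0, 0.0]  # weights of the linear approximation Q(s, a) = w . phi(s, a)

def q_value(node, next_hop):
    return sum(wi * fi for wi, fi in zip(w, features(node, next_hop)))

def choose(node):
    """Epsilon-greedy next-hop selection."""
    if random.random() < EPSILON:
        return random.choice(NEIGHBORS[node])
    return max(NEIGHBORS[node], key=lambda a: q_value(node, a))

for _ in range(2000):  # training episodes
    node = 0
    while node != DEST:
        nxt = choose(node)
        # Invented reward: +10 for reaching the destination, -1 per hop
        # (a stand-in for the paper's throughput/energy-aware reward).
        reward = 10.0 if nxt == DEST else -1.0
        target = reward
        if nxt != DEST:
            target += GAMMA * max(q_value(nxt, a) for a in NEIGHBORS[nxt])
        td_error = target - q_value(node, nxt)
        # Semi-gradient TD update of the weight vector.
        w = [wi + ALPHA * td_error * fi for wi, fi in zip(w, features(node, nxt))]
        node = nxt

# Extract the greedy route after training.
route, node = [0], 0
while node != DEST:
    node = max(NEIGHBORS[node], key=lambda a: q_value(node, a))
    route.append(node)
print("learned route:", route)

In the paper itself, the state reflects each node's own EH process, channel state, and buffered data, and the reward trades throughput off against energy consumption; the toy reward above only mimics that shape.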


Bibliographic Details
Main Authors: Xiaoli He, Hong Jiang, Yu Song, Chunlin He, He Xiao
Format: Article
Language: English
Published: IEEE, 2019-01-01
Series: IEEE Access
Subjects: Routing selection; multi-hop CRN; energy harvesting; Q learning; reinforcement learning; MDP
Online Access:https://ieeexplore.ieee.org/document/8697342/
DOI: 10.1109/ACCESS.2019.2912996
ISSN: 2169-3536
Published in: IEEE Access, vol. 7, pp. 54435-54448, 2019
Author Affiliations:
Xiaoli He (https://orcid.org/0000-0002-8028-479X), Hong Jiang, Yu Song, He Xiao: School of Information Engineering, South West University of Science and Technology, Mianyang, China
Chunlin He: School of Computer Science, China West Normal University, Nanchong, China