Routing Selection With Reinforcement Learning for Energy Harvesting Multi-Hop CRN

This paper considers the routing problem in the communication process of an energy harvesting (EH) multi-hop cognitive radio network (CRN). The transmitter and the relays harvest energy from the environment and use it exclusively for transmitting data. At each relay on the path, a limited data buffer is used to store received data before it is forwarded. We consider a realistic scenario in which each EH node has only local causal knowledge, i.e., at any time it knows only its own EH process, channel state, and currently received data. We propose an EH routing algorithm for multi-hop CRNs based on Q learning in reinforcement learning (RL), called EHR-QL. The goal is to find an optimal routing policy that maximizes throughput and minimizes energy consumption. Modeling the problem as a partially observable Markov decision process (POMDP), we use Q learning with linear function approximation to obtain the optimal path through continuous, intelligent selection. Compared with basic Q learning routing, EHR-QL is superior over longer distances and higher hop counts: it harvests more energy, consumes less, and yields predictable residual energy. In addition, the time complexity of EHR-QL is analyzed and its convergence is proved. In the simulation experiments, we first verify EHR-QL using six EH secondary user (EH-SU) nodes; we then evaluate its performance (network lifetime, residual energy, and average throughput) under different learning rates α and discount factors γ. The results show that EHR-QL achieves higher throughput, a longer network lifetime, lower latency, and lower energy consumption than basic Q learning routing algorithms.
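As an illustrative aid (not taken from the paper), the sketch below shows Q learning with linear function approximation applied to next-hop routing, the core technique the abstract describes. Everything concrete here (the six-node topology, the feature vector, the reward values, and the hyperparameters) is an assumption made for the example, not the authors' EHR-QL design.

# Illustrative sketch, not the authors' code: Q learning with linear
# function approximation for next-hop selection in a small multi-hop
# network. The update rule is the standard one,
#   Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)),
# with Q(s,a) = w . phi(s,a); topology, features, and rewards are invented.
import random

ALPHA, GAMMA, EPSILON = 0.1, 0.8, 0.1  # learning rate, discount factor, exploration rate

# Hypothetical 6-node topology (the paper simulates six EH-SU nodes).
NEIGHBORS = {0: [1, 2], 1: [2, 3], 2: [3, 4], 3: [4, 5], 4: [5]}
DEST = 5

def features(node, next_hop):
    """Toy feature vector phi(s, a): a bias term plus normalized node ids."""
    return [1.0, node / 5.0, next_hop / 5.0]

w = [0.0, 0.0, 0.0]  # weights of the linear approximation Q(s, a) = w . phi(s, a)

def q_value(node, next_hop):
    return sum(wi * fi for wi, fi in zip(w, features(node, next_hop)))

def choose(node):
    """Epsilon-greedy next-hop selection."""
    if random.random() < EPSILON:
        return random.choice(NEIGHBORS[node])
    return max(NEIGHBORS[node], key=lambda a: q_value(node, a))

for _ in range(2000):  # training episodes
    node = 0
    while node != DEST:
        nxt = choose(node)
        # Invented reward: +10 for reaching the destination, -1 per hop
        # (a stand-in for the paper's throughput/energy-aware reward).
        reward = 10.0 if nxt == DEST else -1.0
        target = reward
        if nxt != DEST:
            target += GAMMA * max(q_value(nxt, a) for a in NEIGHBORS[nxt])
        td_error = target - q_value(node, nxt)
        # Semi-gradient TD update of the weight vector.
        w = [wi + ALPHA * td_error * fi for wi, fi in zip(w, features(node, nxt))]
        node = nxt

# Extract the greedy route after training.
route, node = [0], 0
while node != DEST:
    node = max(NEIGHBORS[node], key=lambda a: q_value(node, a))
    route.append(node)
print("learned route:", route)

In the paper itself, the state reflects each node's own EH process, channel state, and buffered data, and the reward trades throughput off against energy consumption; the toy reward above only mimics that shape.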


Bibliographic Details
Main Authors: Xiaoli He, Hong Jiang, Yu Song, Chunlin He, He Xiao
Format: Article
Language: English
Published: IEEE, 2019-01-01
Series: IEEE Access
Subjects: Routing selection; multi-hop CRN; energy harvesting; Q learning; reinforcement learning; MDP
Online Access:https://ieeexplore.ieee.org/document/8697342/
DOI: 10.1109/ACCESS.2019.2912996
ISSN: 2169-3536
Published in: IEEE Access, vol. 7, pp. 54435-54448, 2019
Author Affiliations:
Xiaoli He (https://orcid.org/0000-0002-8028-479X), Hong Jiang, Yu Song, He Xiao: School of Information Engineering, South West University of Science and Technology, Mianyang, China
Chunlin He: School of Computer Science, China West Normal University, Nanchong, China