Temporal Graph Traversals Using Reinforcement Learning With Proximal Policy Optimization

Graphs in real-world applications are dynamic both in terms of structure and inputs. Information discovery in such networks, which present dense, deeply connected patterns locally and sparsity globally, can be time-consuming and computationally costly. In this paper we address the shortest-path query in spatio-temporal graphs, a fundamental graph problem with numerous applications. In spatio-temporal graphs, classical shortest-path algorithms are insufficient or even flawed, because information consistency cannot be guaranteed between two timestamps and path recalculation is computationally costly. In this work, we address the complexity and dynamicity of the shortest-path query in spatio-temporal graphs with a simple yet effective model based on reinforcement learning with Proximal Policy Optimization. Our solution simplifies the problem by decomposing the spatio-temporal graph into two components: a static and a dynamic sub-graph. The static graph, known and immutable, is solved efficiently with the A* algorithm. The sub-graphs interconnecting the static graph have unknown dynamics; we address this by modeling the unknown dynamic portion of the graph as a Markov chain that correlates the agent's observations of the environment with the path to be followed. We then derive an action policy through Proximal Policy Optimization to select the locally optimal actions in the Markov process that lead to the shortest path, given the estimated system dynamics. We evaluate the system in a simulation environment constructed in Unity3D. In partially structured and unknown environments with variable environment parameters, we obtain an efficiency 75% greater than that of the comparable deterministic solution.
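The policy-learning step named in the abstract relies on PPO's clipped surrogate objective, which keeps each policy update close to the policy that collected the data. The following is a minimal, self-contained sketch of that objective; the function name, clipping constant, and toy transition numbers are illustrative assumptions, not the authors' implementation.

# Minimal sketch (not the authors' code) of the PPO clipped surrogate objective.
import numpy as np

def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    # Probability ratio r_t(theta) between the updated policy and the data-collecting policy.
    ratio = np.exp(new_logp - old_logp)
    unclipped = ratio * advantages
    # Clipping removes the incentive to push the ratio outside [1 - eps, 1 + eps].
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # PPO maximizes the mean of the element-wise minimum of the two terms.
    return np.mean(np.minimum(unclipped, clipped))

# Toy usage with three hypothetical transitions from a graph-traversal episode.
old_logp = np.log(np.array([0.25, 0.10, 0.40]))
new_logp = np.log(np.array([0.30, 0.25, 0.35]))
advantages = np.array([1.5, -0.5, 0.8])
print(ppo_clip_objective(new_logp, old_logp, advantages))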

Bibliographic Details
Main Authors: Samuel Henrique Silva, Adel Alaeddini, Peyman Najafirad
Format: Article
Language: English
Published: IEEE, 2020-01-01
Series: IEEE Access
Subjects: Machine learning; graphs; Markov-chain; deep reinforcement learning; path-planning
Online Access: https://ieeexplore.ieee.org/document/9055369/
id doaj-a4430e285c9848deb0a781043a53ee8e
record_format Article
spelling doaj-a4430e285c9848deb0a781043a53ee8e, updated 2021-03-30T01:32:10Z: Samuel Henrique Silva (ORCID 0000-0003-0368-181X, Secure AI and Autonomy Laboratory, The University of Texas at San Antonio, San Antonio, TX, USA), Adel Alaeddini (ORCID 0000-0003-4451-3150, Department of Information Systems and Cyber Security, The University of Texas at San Antonio, San Antonio, TX, USA), and Peyman Najafirad (ORCID 0000-0001-9671-577X, Secure AI and Autonomy Laboratory, The University of Texas at San Antonio, San Antonio, TX, USA), "Temporal Graph Traversals Using Reinforcement Learning With Proximal Policy Optimization," IEEE Access, vol. 8, pp. 63910-63922, 2020-01-01, ISSN 2169-3536, DOI 10.1109/ACCESS.2020.2985295, IEEE document 9055369, https://ieeexplore.ieee.org/document/9055369/
collection DOAJ
language English
format Article
sources DOAJ
author Samuel Henrique Silva
Adel Alaeddini
Peyman Najafirad
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2020-01-01
topic Machine learning
graphs
Markov-chain
deep reinforcement learning
path-planning
url https://ieeexplore.ieee.org/document/9055369/