Selective network discovery via deep reinforcement learning on embedded spaces

Abstract: Complex networks are often either too large for full exploration, partially accessible, or partially observed. Downstream learning tasks on these incomplete networks can produce low quality results. In addition, reducing the incompleteness of the network can be costly and nontrivial. As a result, network discovery algorithms optimized for specific downstream learning tasks given resource collection constraints are of great interest. In this paper, we formulate the task-specific network discovery problem as a sequential decision-making problem. Our downstream task is selective harvesting, the optimal collection of vertices with a particular attribute. We propose a framework, called network actor critic (NAC), which learns a policy and notion of future reward in an offline setting via a deep reinforcement learning algorithm. The NAC paradigm utilizes a task-specific network embedding to reduce the state space complexity. A detailed comparative analysis of popular network embeddings is presented with respect to their role in supporting offline planning. Furthermore, a quantitative study is presented on various synthetic and real benchmarks using NAC and several baselines. We show that offline models of reward and network discovery policies lead to significantly improved performance when compared to competitive online discovery algorithms. Finally, we outline learning regimes where planning is critical in addressing sparse and changing reward signals.
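The abstract describes NAC as an actor-critic method that plans over a task-specific embedding of the partially observed network, choosing which node to query next so as to harvest vertices with a target attribute. The sketch below is a minimal, hypothetical illustration of that setup, not the authors' implementation: it assumes an external process supplies a vector embedding of the observed subgraph and of each candidate boundary node, uses an actor head to score candidates and a critic head to value the state, and applies a one-step advantage update with an assumed reward of 1 when a queried node carries the target attribute. Embedding choice, reward definition, and hyperparameters are all illustrative assumptions.

# Minimal actor-critic sketch of the selective-discovery idea in the abstract.
# The state/candidate embeddings, reward, and network sizes are placeholders.
import torch
import torch.nn as nn
from torch.distributions import Categorical

EMB_DIM = 32  # assumed dimensionality of the task-specific embedding

class ActorCritic(nn.Module):
    def __init__(self, emb_dim: int = EMB_DIM, hidden: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(emb_dim, hidden), nn.ReLU())
        self.actor_head = nn.Linear(hidden, 1)   # score for each candidate node
        self.critic_head = nn.Linear(hidden, 1)  # value of the current state

    def forward(self, state_emb: torch.Tensor, candidate_embs: torch.Tensor):
        # state_emb: (emb_dim,) embedding of the observed subgraph
        # candidate_embs: (num_candidates, emb_dim) embeddings of boundary nodes
        logits = self.actor_head(self.encoder(candidate_embs)).squeeze(-1)
        value = self.critic_head(self.encoder(state_emb)).squeeze(-1)
        return Categorical(logits=logits), value

def one_step_update(model, optimizer, state_emb, candidate_embs,
                    reward, next_value, gamma=0.99):
    """Sample a node to query and apply a one-step advantage actor-critic update."""
    policy, value = model(state_emb, candidate_embs)
    action = policy.sample()                      # index of the node to query
    advantage = reward + gamma * next_value.detach() - value
    policy_loss = -policy.log_prob(action) * advantage.detach()
    value_loss = advantage.pow(2)
    optimizer.zero_grad()
    (policy_loss + value_loss).backward()
    optimizer.step()
    return action.item()

# Toy usage with random placeholder embeddings and a dummy reward signal.
model = ActorCritic()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
state = torch.randn(EMB_DIM)
candidates = torch.randn(5, EMB_DIM)              # five boundary nodes to choose from
chosen = one_step_update(model, optimizer, state, candidates,
                         reward=1.0, next_value=torch.tensor(0.0))
print("queried candidate index:", chosen)

A full agent would roll this step out over a discovery episode: query the sampled node, update the observed subgraph and its embedding, and repeat until the collection budget is exhausted.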

Bibliographic Details
Main Authors: Morales, Peter (Author), Caceres, Rajmonda S (Author), Eliassi-Rad, Tina (Author)
Format: Article
Language: English
Published: Springer International Publishing, 2021-09-20T17:41:47Z.
Subjects:
Online Access: Get fulltext
LEADER 01966 am a22001573u 4500
001 132068
042 |a dc 
100 1 0 |a Morales, Peter  |e author 
700 1 0 |a Caceres, Rajmonda S  |e author 
700 1 0 |a Eliassi-Rad, Tina  |e author 
245 0 0 |a Selective network discovery via deep reinforcement learning on embedded spaces 
260 |b Springer International Publishing,   |c 2021-09-20T17:41:47Z. 
856 |z Get fulltext  |u https://hdl.handle.net/1721.1/132068 
520 |a Abstract Complex networks are often either too large for full exploration, partially accessible, or partially observed. Downstream learning tasks on these incomplete networks can produce low quality results. In addition, reducing the incompleteness of the network can be costly and nontrivial. As a result, network discovery algorithms optimized for specific downstream learning tasks given resource collection constraints are of great interest. In this paper, we formulate the task-specific network discovery problem as a sequential decision-making problem. Our downstream task is selective harvesting, the optimal collection of vertices with a particular attribute. We propose a framework, called network actor critic (NAC), which learns a policy and notion of future reward in an offline setting via a deep reinforcement learning algorithm. The NAC paradigm utilizes a task-specific network embedding to reduce the state space complexity. A detailed comparative analysis of popular network embeddings is presented with respect to their role in supporting offline planning. Furthermore, a quantitative study is presented on various synthetic and real benchmarks using NAC and several baselines. We show that offline models of reward and network discovery policies lead to significantly improved performance when compared to competitive online discovery algorithms. Finally, we outline learning regimes where planning is critical in addressing sparse and changing reward signals. 
546 |a en 
655 7 |a Article