Understanding the limitations of network online learning
Abstract: Studies of networked phenomena, such as interactions in online social media, often rely on incomplete data, either because these phenomena are partially observed, or because the data is too large or expensive to acquire all at once. Analysis of incomplete data leads to skewed or misleading...
Main Authors: | Timothy LaRock, Timothy Sakharov, Sahely Bhadra, Tina Eliassi-Rad |
---|---|
Format: | Article |
Language: | English |
Published: | SpringerOpen, 2020-09-01 |
Series: | Applied Network Science |
Subjects: | Partially observed networks; Online learning; Heavy-tailed target distributions |
Online Access: | http://link.springer.com/article/10.1007/s41109-020-00296-w |
id |
doaj-9074238bf59f47f6a7a59fc45b0d0716 |
---|---|
record_format |
Article |
spelling |
doaj-9074238bf59f47f6a7a59fc45b0d0716; 2020-11-25T02:49:29Z; eng; SpringerOpen; Applied Network Science; 2364-8228; 2020-09-01; 51125; 10.1007/s41109-020-00296-w
Understanding the limitations of network online learning
Timothy LaRock [0]; Timothy Sakharov [1]; Sahely Bhadra [2]; Tina Eliassi-Rad [3]
[0] Network Science Institute, Northeastern University, 360 Huntington Ave; [1] Network Science Institute, Northeastern University, 360 Huntington Ave; [2] Department of Computer Science and Engineering, Indian Institute of Technology Palakkad; [3] Network Science Institute, Northeastern University, 360 Huntington Ave
Abstract: Studies of networked phenomena, such as interactions in online social media, often rely on incomplete data, either because these phenomena are partially observed, or because the data is too large or expensive to acquire all at once. Analysis of incomplete data leads to skewed or misleading results. In this paper, we investigate limitations of learning to complete partially observed networks via node querying. Concretely, we study the following problem: given (i) a partially observed network, (ii) the ability to query nodes for their connections (e.g., by accessing an API), and (iii) a budget on the number of such queries, sequentially learn which nodes to query in order to maximally increase observability. We call this querying process Network Online Learning and present a family of algorithms called NOL*. These algorithms learn to choose which partially observed node to query next based on a parameterized model that is trained online through a process of exploration and exploitation. Extensive experiments on both synthetic and real world networks show that (i) it is possible to sequentially learn to choose which nodes are best to query in a network and (ii) some macroscopic properties of networks, such as the degree distribution and modular structure, impact the potential for learning and the optimal amount of random exploration.
http://link.springer.com/article/10.1007/s41109-020-00296-w
Partially observed networks; Online learning; Heavy-tailed target distributions |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Timothy LaRock, Timothy Sakharov, Sahely Bhadra, Tina Eliassi-Rad |
title |
Understanding the limitations of network online learning |
publisher |
SpringerOpen |
series |
Applied Network Science |
issn |
2364-8228 |
publishDate |
2020-09-01 |
description |
Abstract: Studies of networked phenomena, such as interactions in online social media, often rely on incomplete data, either because these phenomena are partially observed, or because the data is too large or expensive to acquire all at once. Analysis of incomplete data leads to skewed or misleading results. In this paper, we investigate limitations of learning to complete partially observed networks via node querying. Concretely, we study the following problem: given (i) a partially observed network, (ii) the ability to query nodes for their connections (e.g., by accessing an API), and (iii) a budget on the number of such queries, sequentially learn which nodes to query in order to maximally increase observability. We call this querying process Network Online Learning and present a family of algorithms called NOL*. These algorithms learn to choose which partially observed node to query next based on a parameterized model that is trained online through a process of exploration and exploitation. Extensive experiments on both synthetic and real-world networks show that (i) it is possible to sequentially learn to choose which nodes are best to query in a network and (ii) some macroscopic properties of networks, such as the degree distribution and modular structure, impact the potential for learning and the optimal amount of random exploration. |
topic |
Partially observed networks; Online learning; Heavy-tailed target distributions |
url |
http://link.springer.com/article/10.1007/s41109-020-00296-w |
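The querying problem summarized in the abstract (a partially observed network, a query budget, and a model trained online through exploration and exploitation) can be illustrated with a small sketch. This is not the authors' NOL* implementation: the function name `query_with_online_learning`, the three features, the epsilon-greedy rule, the squared-error update, and the default `epsilon` and `lr` values are all assumptions chosen for illustration. The hidden network `full_adj` stands in for the API that returns a node's connections.

```python
import math
import random


def query_with_online_learning(full_adj, seed_nodes, budget, epsilon=0.1, lr=0.05, rng=None):
    """Epsilon-greedy node querying with an online linear reward model.

    Illustrative sketch only (hypothetical names and features), not NOL* itself.
    full_adj: dict mapping node -> iterable of neighbors (the hidden network).
    Returns (observed, queried, rewards, weights).
    """
    rng = rng or random.Random(0)
    observed = {s: set() for s in seed_nodes}  # partial view: node -> known neighbors
    queried = set()
    weights = [0.0, 0.0, 0.0]  # bias, log observed degree, fraction of queried neighbors
    rewards = []

    def features(node):
        nbrs = observed[node]
        deg = len(nbrs)
        frac_queried = sum(1 for n in nbrs if n in queried) / deg if deg else 0.0
        return [1.0, math.log1p(deg), frac_queried]

    def predict(x):
        return sum(w * xi for w, xi in zip(weights, x))

    for _ in range(budget):
        candidates = [n for n in observed if n not in queried]
        if not candidates:
            break  # every observed node has already been queried
        if rng.random() < epsilon:
            node = rng.choice(candidates)  # explore: pick a random candidate
        else:
            node = max(candidates, key=lambda n: predict(features(n)))  # exploit
        x = features(node)

        # Query the node: reveal all of its true neighbors.
        before = len(observed)
        for nbr in full_adj[node]:
            observed[node].add(nbr)
            observed.setdefault(nbr, set()).add(node)
        queried.add(node)
        reward = len(observed) - before  # number of newly observed nodes

        # Online update: one gradient step on the squared prediction error.
        error = reward - predict(x)
        for i in range(len(weights)):
            weights[i] += lr * error * x[i]
        rewards.append(reward)

    return observed, queried, rewards, weights
```

Calling `query_with_online_learning(adj, [0], budget=3)` on a small adjacency dictionary `adj` returns the partial network observed so far, the set of queried nodes, the per-query rewards, and the learned weights. The reward here counts newly observed nodes, which is one simple way to measure an increase in observability; other reward definitions and feature sets would fit the same loop.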