Assessing Low-Intensity Relationships in Complex Networks.

Many large network data sets are noisy and contain links representing low-intensity relationships that are difficult to differentiate from random interactions. This is especially relevant for high-throughput data from systems biology and large-scale ecological data, but also for Web 2.0 data on human i...

Bibliographic Details
Main Authors:	Andreas Spitz, Anna Gimmler, Thorsten Stoeck, Katharina Anna Zweig, Emőke-Ágnes Horvát
Format:	Article
Language:	English
Published:	Public Library of Science (PLoS) 2016-01-01
Series:	PLoS ONE
Online Access:	http://europepmc.org/articles/PMC4838277?pdf=render

id	doaj-0bcde7d28b4f4ccabc4050974aac2d46
record_format	Article
spelling	doaj-0bcde7d28b4f4ccabc4050974aac2d46; 2020-11-24T21:35:15Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2016-01-01; 11; 4; e0152536; 10.1371/journal.pone.0152536; Assessing Low-Intensity Relationships in Complex Networks.; Andreas Spitz; Anna Gimmler; Thorsten Stoeck; Katharina Anna Zweig; Emőke-Ágnes Horvát; http://europepmc.org/articles/PMC4838277?pdf=render
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Andreas Spitz, Anna Gimmler, Thorsten Stoeck, Katharina Anna Zweig, Emőke-Ágnes Horvát
author_sort	Andreas Spitz
title	Assessing Low-Intensity Relationships in Complex Networks.
title_sort	assessing low-intensity relationships in complex networks.
publisher	Public Library of Science (PLoS)
series	PLoS ONE
issn	1932-6203
publishDate	2016-01-01
description	Many large network data sets are noisy and contain links representing low-intensity relationships that are difficult to differentiate from random interactions. This is especially relevant for high-throughput data from systems biology and large-scale ecological data, but also for Web 2.0 data on human interactions. In these networks with missing and spurious links, it is possible to refine the data based on the principle of structural similarity, which assesses the shared neighborhood of two nodes. By using similarity measures to globally rank all possible links and choosing the top-ranked pairs, true links can be validated, missing links inferred, and spurious observations removed. While many similarity measures have been proposed to this end, there is no general consensus on which one to use. In this article, we first contribute a set of benchmarks for complex networks from three different settings (e-commerce, systems biology, and social networks) and thus enable a quantitative performance analysis of classic node similarity measures. Based on this, we then propose a new methodology for link assessment called z*, which assesses the statistical significance of the number of common neighbors of two nodes by comparison with the expected value in a suitably chosen random graph model and which is a consistently top-performing algorithm on all benchmarks. In addition to a global ranking of links, we also use this method to identify the most similar neighbors of each single node in a local ranking, thereby showing the versatility of the method in two distinct scenarios and broadening its applicability. Finally, we perform an exploratory analysis on an oceanographic plankton data set and find that the distribution of microbes follows biogeographic rules similar to those of macroorganisms, a result that rejects the global dispersal hypothesis for microbes.
url	http://europepmc.org/articles/PMC4838277?pdf=render
work_keys_str_mv	AT andreasspitz assessinglowintensityrelationshipsincomplexnetworks AT annagimmler assessinglowintensityrelationshipsincomplexnetworks AT thorstenstoeck assessinglowintensityrelationshipsincomplexnetworks AT katharinaannazweig assessinglowintensityrelationshipsincomplexnetworks AT emokeagneshorvat assessinglowintensityrelationshipsincomplexnetworks
_version_	1725945761626062848

Similar Items