Nonparametric sparsification of complex multiscale networks.

Many real-world networks tend to be very dense. Particular examples of interest arise in the construction of networks that represent pairwise similarities between objects. In these cases, the networks under consideration are weighted, generally with positive weights between any two nodes. Visualizat...

Full description

Bibliographic Details
Main Authors: Nicholas J Foti, James M Hughes, Daniel N Rockmore
Format: Article
Language:English
Published: Public Library of Science (PLoS) 2011-02-01
Series:PLoS ONE
Online Access:https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/21346815/pdf/?tool=EBI
Description
Summary:Many real-world networks tend to be very dense. Particular examples of interest arise in the construction of networks that represent pairwise similarities between objects. In these cases, the networks under consideration are weighted, generally with positive weights between any two nodes. Visualization and analysis of such networks, especially when the number of nodes is large, can pose significant challenges which are often met by reducing the edge set. Any effective "sparsification" must retain and reflect the important structure in the network. A common method is to simply apply a hard threshold, keeping only those edges whose weight exceeds some predetermined value. A more principled approach is to extract the multiscale "backbone" of a network by retaining statistically significant edges through hypothesis testing on a specific null model, or by appropriately transforming the original weight matrix before applying some sort of threshold. Unfortunately, approaches such as these can fail to capture multiscale structure in which there can be small but locally statistically significant similarity between nodes. In this paper, we introduce a new method for backbone extraction that does not rely on any particular null model, but instead uses the empirical distribution of similarity weight to determine and then retain statistically significant edges. We show that our method adapts to the heterogeneity of local edge weight distributions in several paradigmatic real world networks, and in doing so retains their multiscale structure with relatively insignificant additional computational costs. We anticipate that this simple approach will be of great use in the analysis of massive, highly connected weighted networks.
ISSN:1932-6203