Improving the Efficiency and Effectiveness of Community Detection via Prior-Induced Equivalent Super-Network

Bibliographic Details
Main Authors: Liang Yang, Di Jin, Dongxiao He, Huazhu Fu, Xiaochun Cao, Francoise Fogelman-Soulie
Format: Article
Language: English
Published: Nature Publishing Group 2017-03-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-017-00587-w
Description
Summary: Because community structure is central to understanding networks, and interest in community detectability has surged, improving community identification with pairwise prior information has become a hot topic. However, most existing semi-supervised community detection algorithms focus only on improving accuracy and ignore the potential of priors to speed up detection. They also require tuning additional parameters and cannot guarantee that the pairwise constraints are satisfied. To address these drawbacks, we propose a general, fast, effective and parameter-free semi-supervised community detection framework. By constructing indivisible super-nodes from the connected subgraphs of the must-link constraints, and by forming weighted super-edges based on network topology and cannot-link constraints, the framework transforms the original network into an equivalent but much smaller super-network. The super-network strictly enforces the must-link constraints and effectively encodes the cannot-link constraints. Furthermore, the time complexity of super-network construction is linear in the size of the original network, which makes it efficient. Because the constructed super-network is much smaller than the original, any existing community detection algorithm runs much faster within the framework. Finally, the overall process introduces no additional parameters, making it more practical.
ISSN: 2045-2322
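Illustration: The summary describes merging nodes joined by must-link constraints into super-nodes and aggregating edges into weighted super-edges. The Python sketch below shows one plausible reading of that construction and is not the authors' implementation. The function name `build_super_network`, the union-find helper, and the choice of weight (a plain count of original edges between two super-nodes) are all assumptions. This record does not state how cannot-link constraints enter the super-edge weights, so the sketch only maps them to super-node pairs and returns them separately.

```python
from collections import defaultdict


def find(parent, x):
    """Return the representative of x's set (union-find with path halving)."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def build_super_network(nodes, edges, must_link, cannot_link):
    """Hypothetical sketch: collapse must-link components into super-nodes.

    Returns (super_of, weights, separated):
      super_of  maps each original node to its super-node label,
      weights   maps an ordered super-node pair to the number of original
                edges between them (a pair (a, a) counts internal edges),
      separated holds the super-node pairs implied by cannot-link pairs.
    """
    # Each connected component of the must-link graph becomes one super-node.
    parent = {v: v for v in nodes}
    for u, v in must_link:
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:
            parent[rv] = ru
    super_of = {v: find(parent, v) for v in nodes}

    # Assumption: a super-edge weight is the count of original edges.
    weights = defaultdict(int)
    for u, v in edges:
        a, b = super_of[u], super_of[v]
        weights[(min(a, b), max(a, b))] += 1

    # Assumption: cannot-link pairs are only lifted to super-node pairs.
    separated = set()
    for u, v in cannot_link:
        a, b = super_of[u], super_of[v]
        if a == b:
            raise ValueError(f"cannot-link ({u}, {v}) conflicts with must-link")
        separated.add((min(a, b), max(a, b)))

    return super_of, dict(weights), separated


# Toy network: two triangles joined by the edge (3, 4).
nodes = [1, 2, 3, 4, 5, 6]
edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (5, 6), (4, 6)]
super_of, weights, separated = build_super_network(
    nodes, edges, must_link=[(1, 2), (2, 3), (5, 6)], cannot_link=[(1, 5)]
)
```

In this toy case the six nodes collapse into three super-nodes, so any community detection algorithm that accepts weighted edges would run on a smaller graph. Every edge and constraint is visited once, so the cost is close to linear in the input size, in line with the linear-time claim in the summary.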