Infoflow: A distributed algorithm to detect communities according to the map equation

Bibliographic Details
Main Author: Fung, P.K. (Author)
Format: Article
Language: English
Published: MDPI AG 2019
Subjects: Big data; Community detection; Graph
Online Access: https://doi.org/10.3390/bdcc3030042
LEADER 02088nam a2200169Ia 4500
001 10.3390-bdcc3030042
008 220511s2019 CNT 000 0 und d
020 |a 25042289 (ISSN) 
245 1 0 |a Infoflow: A distributed algorithm to detect communities according to the map equation 
260 0 |b MDPI AG  |c 2019 
856 |z View Fulltext in Publisher  |u https://doi.org/10.3390/bdcc3030042 
520 3 |a Formidably sized networks are becoming more and more common, including in social sciences, biology, neuroscience, and the technology space. Many network sizes are expected to challenge the storage capability of a single physical computer. Here, we take two approaches to handle big networks: first, we look at how big data technology and distributed computing are an exciting approach to big data storage and processing. Second, most networks can be partitioned or labeled into communities, clusters, or modules, thus capturing the crux of the network while reducing detailed information, through the class of algorithms known as community detection. In this paper, we combine these two approaches, developing a distributed community detection algorithm to handle big networks. In particular, the map equation provides a way to identify network communities according to the information flow between nodes, where InfoMap is a greedy algorithm that uses the map equation. We develop discrete mathematics to adapt InfoMap into a distributed computing framework and then further develop the mathematics for a greedy algorithm, InfoFlow, which has logarithmic time complexity, compared to the linear complexity of InfoMap. Benchmark results on graphs of up to millions of nodes and hundreds of millions of edges confirm the time complexity improvement, while maintaining community accuracy. Thus, we develop a map-equation-based community detection algorithm suitable for big network data processing. © 2019 by the author. Licensee MDPI, Basel, Switzerland. 
650 0 4 |a Big data 
650 0 4 |a Community detection 
650 0 4 |a Graph 
700 1 |a Fung, P.K.  |e author 
773 |t Big Data and Cognitive Computing