A Two-Tier Partition Algorithm for the Optimization of the Large-scale Simulation of Information Diffusion in Social Networks

As online social networks play an increasingly important role in public opinion, the large-scale simulation of social networks has attracted researchers from sociology, communication, informatics, and other fields. It is a good way to study real information diffusion in a symmetrical simulatio...

Bibliographic Details
Main Authors: Bin Chen, Hailiang Chen, Dandan Ning, Mengna Zhu, Chuan Ai, Xiaogang Qiu, Weihui Dai
Format: Article
Language: English
Published: MDPI AG 2020-05-01
Series: Symmetry
Subjects: social network simulation; ABMS; Spark; two-tier partition algorithm
Online Access: https://www.mdpi.com/2073-8994/12/5/843
id doaj-67616a9e492b4aceb9b63ea217105880
record_format Article
spelling doaj-67616a9e492b4aceb9b63ea217105880
2020-11-25T03:48:47Z
eng
MDPI AG
Symmetry
2073-8994
2020-05-01
12
843
843
10.3390/sym12050843
Bin Chen, College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
Hailiang Chen, College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
Dandan Ning, College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
Mengna Zhu, College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
Chuan Ai, College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
Xiaogang Qiu, College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
Weihui Dai, Department of Information Management and Information Systems, School of Management, Fudan University, Shanghai 200433, China
collection DOAJ
language English
format Article
sources DOAJ
author Bin Chen
Hailiang Chen
Dandan Ning
Mengna Zhu
Chuan Ai
Xiaogang Qiu
Weihui Dai
title A Two-Tier Partition Algorithm for the Optimization of the Large-scale Simulation of Information Diffusion in Social Networks
publisher MDPI AG
series Symmetry
issn 2073-8994
publishDate 2020-05-01
description As online social networks play an increasingly important role in public opinion, the large-scale simulation of social networks has attracted researchers from sociology, communication, informatics, and other fields. Agent-based modeling and simulation (ABMS), which scholars in computational sociology consider an effective solution, offers a good way to study real information diffusion in a symmetrical simulation world. However, classical ABMS tools such as NetLogo cannot scale beyond thousands of agents, while big data platforms such as Hadoop and Spark, which are used to study large datasets, provide no optimization for simulating large-scale social networks. This paper proposes a two-tier partition algorithm to optimize the large-scale simulation of social networks. First, the ABMS simulation kernel for information diffusion is implemented on the Spark platform; both the data structure and the scheduling mechanism are built on Resilient Distributed Datasets (RDDs) so that millions of agents can be simulated. Second, the two-tier partition algorithm is implemented through community detection and graph cut: community detection finds partitions with dense internal interactions, and the graph cut balances the load. Finally, using a dataset recorded from Twitter, a series of experiments evaluates the two-tier partition algorithm in terms of both communication cost and load balance.
topic social network simulation
ABMS
Spark
two-tier partition algorithm
url https://www.mdpi.com/2073-8994/12/5/843
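
The two-tier idea summarized in the description (group densely interacting agents first, then balance the load across partitions) can be illustrated with a small sketch. This is not the authors' implementation: the record gives no algorithmic detail, so label propagation stands in for the community detection step, a greedy largest-first assignment stands in for the graph cut step, and all function names (build_adjacency, label_propagation, balance_partitions, cut_size) are hypothetical. It runs on plain Python rather than Spark.

```python
from collections import Counter, defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def label_propagation(adj, max_rounds=20):
    """Tier 1 (assumed stand-in): asynchronous label propagation.

    Each node adopts the most frequent label among its neighbours;
    ties are broken by taking the largest label so the run is deterministic.
    """
    labels = {v: v for v in adj}
    for _ in range(max_rounds):
        changed = False
        for v in sorted(adj):
            counts = Counter(labels[u] for u in adj[v])
            if not counts:
                continue  # isolated node keeps its own label
            top = max(counts.values())
            new_label = max(l for l, c in counts.items() if c == top)
            if new_label != labels[v]:
                labels[v] = new_label
                changed = True
        if not changed:
            break
    communities = defaultdict(set)
    for v, label in labels.items():
        communities[label].add(v)
    return list(communities.values())


def balance_partitions(communities, k):
    """Tier 2 (assumed stand-in): place whole communities, largest first,
    on the currently least-loaded of k partitions."""
    loads = [0] * k
    assignment = {}
    for community in sorted(communities, key=len, reverse=True):
        target = min(range(k), key=loads.__getitem__)
        for node in community:
            assignment[node] = target
        loads[target] += len(community)
    return assignment


def cut_size(adj, assignment):
    """Count edges whose endpoints sit in different partitions,
    a simple proxy for communication cost."""
    return sum(
        1
        for u in adj
        for v in adj[u]
        if u < v and assignment[u] != assignment[v]
    )


if __name__ == "__main__":
    # Two triangles joined by a single bridge edge (2, 3).
    edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
    adj = build_adjacency(edges)
    parts = balance_partitions(label_propagation(adj), k=2)
    print("assignment:", parts)
    print("cut edges:", cut_size(adj, parts))
    print("loads:", Counter(parts.values()))
```

Keeping each community on one partition keeps most interactions local, and the load counter shows how evenly agents are spread. A real graph cut would also split oversized communities, which this greedy placement does not attempt.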