HPC-based Parallel Algorithms for Generating Random Networks and Some Other Network Analysis Problems

The advancement of modern technologies has resulted in an explosive growth of complex systems, such as the Internet, biological, social, and various infrastructure networks, which have, in turn, contributed to the rise of massive networks. During the past decade, analyzing and mining of these networ...

Full description

Bibliographic Details
Main Author: Alam, Md Maksudul
Other Authors: Computer Science
Format: Others
Published: Virginia Tech 2016
Subjects:
Online Access:http://hdl.handle.net/10919/73582
id ndltd-VTETD-oai-vtechworks.lib.vt.edu-10919-73582
record_format oai_dc
spelling ndltd-VTETD-oai-vtechworks.lib.vt.edu-10919-735822020-09-29T05:37:47Z HPC-based Parallel Algorithms for Generating Random Networks and Some Other Network Analysis Problems Alam, Md Maksudul Computer Science Marathe, Madhav Vishnu Khan, Md-Abdul Maleq Vullikanti, Anil Kumar S. Heath, Lenwood S. Perumalla, Kalyan Network Science Parallel Algorithms High Performance Computing Random Networks The advancement of modern technologies has resulted in an explosive growth of complex systems, such as the Internet, biological, social, and various infrastructure networks, which have, in turn, contributed to the rise of massive networks. During the past decade, analyzing and mining of these networks has become an emerging research area with many real-world applications. The most relevant problems in this area include: collecting and managing networks, modeling and generating random networks, and developing network mining algorithms. In the era of big data, speed is not an option anymore for the effective analysis of these massive systems, it is an absolute necessity. This motivates the need for parallel algorithms on modern high-performance computing (HPC) systems including multi-core, distributed, and graphics processor units (GPU) based systems. In this dissertation, we present distributed memory parallel algorithms for generating massive random networks and a novel GPU-based algorithm for index searching. This dissertation is divided into two parts. In Part I, we present parallel algorithms for generating massive random networks using several widely-used models. We design and develop a novel parallel algorithm for generating random networks using the preferential-attachment model. This algorithm can generate networks with billions of edges in just a few minutes using a medium-sized computing cluster. We develop another parallel algorithm for generating random networks with a given sequence of expected degrees. We also design a new a time and space efficient algorithmic method to generate random networks with any degree distributions. This method has been applied to generate random networks using other popular network models, such as block two-level Erdos-Renyi and stochastic block models. Parallel algorithms for network generation pose many nontrivial challenges such as dependency on edges, avoiding duplicate edges, and load balancing. We applied novel techniques to deal with these challenges. All of our algorithms scale very well to a large number of processors and provide almost linear speed-up. Dealing with a large number of networks collected from a variety of fields requires efficient management systems such as graph databases. Finding a record in those databases is very critical and typically is the main bottleneck for performance. In Part II of the dissertation, we develop a GPU-based parallel algorithm for index searching. Our algorithm achieves the fastest throughput ever reported in the literature for various benchmarks. Ph. D. 2016-12-07T09:00:45Z 2016-12-07T09:00:45Z 2016-12-06 Dissertation vt_gsexam:9178 http://hdl.handle.net/10919/73582 In Copyright http://rightsstatements.org/vocab/InC/1.0/ ETD application/pdf Virginia Tech
collection NDLTD
format Others
sources NDLTD
topic Network Science
Parallel Algorithms
High Performance Computing
Random Networks
spellingShingle Network Science
Parallel Algorithms
High Performance Computing
Random Networks
Alam, Md Maksudul
HPC-based Parallel Algorithms for Generating Random Networks and Some Other Network Analysis Problems
description The advancement of modern technologies has resulted in an explosive growth of complex systems, such as the Internet, biological, social, and various infrastructure networks, which have, in turn, contributed to the rise of massive networks. During the past decade, analyzing and mining of these networks has become an emerging research area with many real-world applications. The most relevant problems in this area include: collecting and managing networks, modeling and generating random networks, and developing network mining algorithms. In the era of big data, speed is not an option anymore for the effective analysis of these massive systems, it is an absolute necessity. This motivates the need for parallel algorithms on modern high-performance computing (HPC) systems including multi-core, distributed, and graphics processor units (GPU) based systems. In this dissertation, we present distributed memory parallel algorithms for generating massive random networks and a novel GPU-based algorithm for index searching. This dissertation is divided into two parts. In Part I, we present parallel algorithms for generating massive random networks using several widely-used models. We design and develop a novel parallel algorithm for generating random networks using the preferential-attachment model. This algorithm can generate networks with billions of edges in just a few minutes using a medium-sized computing cluster. We develop another parallel algorithm for generating random networks with a given sequence of expected degrees. We also design a new a time and space efficient algorithmic method to generate random networks with any degree distributions. This method has been applied to generate random networks using other popular network models, such as block two-level Erdos-Renyi and stochastic block models. Parallel algorithms for network generation pose many nontrivial challenges such as dependency on edges, avoiding duplicate edges, and load balancing. We applied novel techniques to deal with these challenges. All of our algorithms scale very well to a large number of processors and provide almost linear speed-up. Dealing with a large number of networks collected from a variety of fields requires efficient management systems such as graph databases. Finding a record in those databases is very critical and typically is the main bottleneck for performance. In Part II of the dissertation, we develop a GPU-based parallel algorithm for index searching. Our algorithm achieves the fastest throughput ever reported in the literature for various benchmarks. === Ph. D.
author2 Computer Science
author_facet Computer Science
Alam, Md Maksudul
author Alam, Md Maksudul
author_sort Alam, Md Maksudul
title HPC-based Parallel Algorithms for Generating Random Networks and Some Other Network Analysis Problems
title_short HPC-based Parallel Algorithms for Generating Random Networks and Some Other Network Analysis Problems
title_full HPC-based Parallel Algorithms for Generating Random Networks and Some Other Network Analysis Problems
title_fullStr HPC-based Parallel Algorithms for Generating Random Networks and Some Other Network Analysis Problems
title_full_unstemmed HPC-based Parallel Algorithms for Generating Random Networks and Some Other Network Analysis Problems
title_sort hpc-based parallel algorithms for generating random networks and some other network analysis problems
publisher Virginia Tech
publishDate 2016
url http://hdl.handle.net/10919/73582
work_keys_str_mv AT alammdmaksudul hpcbasedparallelalgorithmsforgeneratingrandomnetworksandsomeothernetworkanalysisproblems
_version_ 1719344601450414080