MCE-P: Scalable Maximum Clique Enumeration Using Apache Hama

Bibliographic Details
Main Authors: Chieh-Hsuan Cheng, 鄭捷軒
Other Authors: Tai-Lin Chin
Format: Others
Language: en_US
Published: 2015
Online Access: http://ndltd.ncl.edu.tw/handle/00189756932319713552
Description
Summary: Master's === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 104 === The maximum clique mining problem for extremely large graphs arises in many fields, such as social network analysis, bioinformatics, and computational chemistry. Recently, several studies in the literature have addressed the problem using conventional MapReduce algorithms. Nevertheless, those algorithms use the parallel architecture of MapReduce only to partition the graph, and still apply sequential algorithms to find the maximum clique in each subgraph. As a result, the search for the maximum clique itself is not actually performed in parallel. This paper proposes a scheme that mines the maximum clique in a huge graph in parallel using Apache Hama, a general bulk synchronous parallel (BSP) computing engine on top of Hadoop. Essentially, every vertex iteratively executes the same procedure: receiving messages from its neighbors, processing its tasks, and sending messages to its neighbors. The vertices of each clique are collected iteration by iteration until no further vertex can be added. The maximum cliques are then determined among the collected cliques. Our experimental results demonstrate that the proposed solution is more efficient and more scalable than the existing MapReduce algorithms.
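The vertex-centric procedure described in the summary can be illustrated with a small single-process simulation of BSP supersteps. This is only a sketch under stated assumptions, not the MCE-P algorithm from the thesis and not the Apache Hama API: the function name `maximum_cliques_bsp`, the message format (a partial clique as a `frozenset`), and the rule of forwarding only to higher-numbered neighbors (so each clique is built exactly once) are all hypothetical choices made for illustration. Vertex ids are assumed to be comparable, and the graph is assumed to be undirected with no self-loops.

```python
from collections import defaultdict


def maximum_cliques_bsp(adjacency):
    """Simulate vertex-centric BSP supersteps and return all maximum cliques.

    adjacency: dict mapping each vertex id to the set of its neighbors
    (undirected graph, no self-loops; an assumption of this sketch).
    """
    # Before the first superstep, every vertex "receives" the empty clique,
    # so that it starts by proposing the one-vertex clique containing itself.
    inbox = {v: [frozenset()] for v in adjacency}
    best_size = 0
    best = []

    # Each loop iteration is one superstep; the loop ends when no messages
    # are in flight, i.e. no clique can be extended by another vertex.
    while inbox:
        outbox = defaultdict(list)
        for u, messages in inbox.items():
            for partial in messages:
                # u can join only if it is adjacent to every member so far.
                if not partial <= adjacency[u]:
                    continue
                clique = partial | {u}

                # Keep track of the largest cliques collected so far.
                if len(clique) > best_size:
                    best_size, best = len(clique), [clique]
                elif len(clique) == best_size:
                    best.append(clique)

                # Forward the grown clique to higher-numbered neighbors only,
                # so every clique is generated once, in increasing id order.
                for w in adjacency[u]:
                    if w > u:
                        outbox[w].append(clique)
        # Barrier: messages sent in this superstep are delivered in the next.
        inbox = outbox

    return best


if __name__ == "__main__":
    graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
    print(maximum_cliques_bsp(graph))
```

The sketch enumerates every clique, so it is exponential in the worst case and is meant only to show the receive, process, and send cycle and the halting condition. In a real BSP engine the per-vertex loop body would run in parallel across workers, with the barrier between supersteps handled by the framework.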