Summary: | Master's === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 104 === The maximum clique mining problem for extremely large graphs arises in
many fields, such as social networks, bioinformatics, and computational chemistry. Recently,
several studies have addressed the problem with conventional MapReduce
algorithms. However, those algorithms use the parallel architecture of MapReduce
only to partition the graph, and still apply sequential algorithms to find the
maximum clique in each subgraph. As a result, the search for the maximum clique itself
is not actually performed in parallel. This paper proposes a scheme
to mine the maximum clique in a huge graph by a parallel technique based on Apache
Hama, which is a general bulk synchronous parallel (BSP) computing engine on top of
Hadoop. Essentially, every vertex iteratively executes the same procedure: receiving
messages from its neighbors, processing its tasks, and sending messages to its
neighbors. In each iteration, vertices are added to a clique until no
further vertex can be added. The maximum cliques are then selected from the resulting cliques at the
end. Our experimental results demonstrate that the proposed solution is more efficient
and more scalable than the existing MapReduce algorithms.
|
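
The vertex-centric procedure described in the summary can be illustrated with a small single-process simulation. This is only a minimal sketch under stated assumptions, not the thesis's actual algorithm or Apache Hama's API: the function name `bsp_greedy_cliques`, the request/reply message scheme, and the rule of picking the highest-degree candidate are all hypothetical choices made for illustration. The graph is assumed to be undirected, with symmetric adjacency sets and no self-loops. Because each vertex grows its clique greedily, the result is the largest of the maximal cliques found, which is not guaranteed to be a true maximum clique on every graph.

```python
from collections import defaultdict


def bsp_greedy_cliques(adj):
    """Simulate BSP-style supersteps in which every vertex grows a clique.

    adj maps each vertex to the set of its neighbors (undirected, no
    self-loops). Returns the largest clique found, as a set of vertices.
    """
    # Per-vertex state: the clique grown from v, and the candidates that
    # are adjacent to every vertex already in that clique.
    clique = {v: {v} for v in adj}
    cand = {v: set(adj[v]) for v in adj}

    while any(cand.values()):
        # Superstep 1: each vertex that can still grow sends a request to
        # one candidate (hypothetical rule: highest degree, then largest id).
        requests = defaultdict(list)
        for v in adj:
            if cand[v]:
                u = max(cand[v], key=lambda w: (len(adj[w]), w))
                requests[u].append(v)

        # Superstep 2: each requested vertex replies with its neighbor set.
        replies = defaultdict(list)
        for u, senders in requests.items():
            for v in senders:
                replies[v].append((u, adj[u]))

        # Superstep 3: each vertex processes its replies, adds the new
        # member, and keeps only candidates adjacent to that member too.
        for v, msgs in replies.items():
            for u, neighbors_of_u in msgs:
                clique[v].add(u)
                cand[v] &= neighbors_of_u

    # Final step: select the largest clique among those collected.
    return max(clique.values(), key=len)


# Toy graph: a 4-clique {1, 2, 3, 4} with a short tail 4-5-6.
graph = {
    1: {2, 3, 4},
    2: {1, 3, 4},
    3: {1, 2, 4},
    4: {1, 2, 3, 5},
    5: {4, 6},
    6: {5},
}
largest = bsp_greedy_cliques(graph)
print(sorted(largest))
```

In this sketch each loop iteration stands in for a group of supersteps separated by synchronization barriers. In a real BSP engine the per-vertex loops would run in parallel across workers and the dictionaries of requests and replies would be replaced by actual message passing.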