The Study of Overlapping Community Discovery in MapReduce

Bibliographic Details
Main Authors: Wei-Lin Hsu, 許維麟
Other Authors: Yi-Jen Su
Format: Others
Language: zh-TW
Published: 2014
Online Access: http://ndltd.ncl.edu.tw/handle/32716652592720373002
Description
Summary: Master's === Shu-Te University === Master's Program, Department of Computer Science and Information Engineering === 102 === The growing popularity of mobile networks has increased user interaction on social media platforms and produced a massive number of virtual communities, each with its own features. To track the latest trends in large, complex network communities, an effective community discovery mechanism is needed that responds to dynamic changes in community composition. This study proposes a two-phase algorithm for dynamic overlapping community discovery. The first phase performs static overlapping community discovery, using MapReduce-based distributed computing to overcome the hardware bottleneck of centralized computing and to improve execution efficiency. First, the adjacent-node information for each node in the network is collected; the TTT algorithm is then applied to enumerate all maximal cliques, and the CPM algorithm is used to discover overlapping communities from them. The second phase performs dynamic overlapping community discovery on only the affected areas of the network, applying MapReduce to adjust the static discovery results incrementally and so avoid redundant analysis. Experimental results demonstrate the validity of this approach. The experimental data are derived from friend relationships among YouTube users. The data for static overlapping community discovery consist of six groups ranging from 50,000 to 300,000 relationships. The data for dynamic overlapping community discovery consist of five groups ranging from 2,000 to 10,000 relationships, which are added in turn to the static overlapping communities to generate the final clustering results.
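
The static phase described above (collect adjacent nodes, enumerate maximal cliques, then apply CPM) can be illustrated with a minimal single-machine sketch. It is not the thesis's MapReduce implementation: the function names, the toy edge list, and the choice of k = 3 are assumptions made for illustration. The clique enumeration uses a pivoted depth-first search in the style usually attributed to TTT (Tomita, Tanaka, and Takahashi), and the CPM (Clique Percolation Method) step uses the common formulation that merges maximal cliques of size at least k sharing at least k-1 nodes.

```python
from collections import defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Collect each node's neighbours from an undirected edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with a pivoted depth-first search."""
    if not adj:
        return []
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already-explored vertices
        if not p and not x:
            cliques.append(frozenset(r))
            return
        # Pivot: the vertex in (p union x) with the most neighbours in p.
        pivot = max(p | x, key=lambda u: len(p & adj[u]))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def clique_percolation(cliques, k):
    """Merge maximal cliques of size >= k that share at least k-1 nodes."""
    big = [c for c in cliques if len(c) >= k]
    parent = list(range(len(big)))  # union-find over clique indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(big)), 2):
        if len(big[i] & big[j]) >= k - 1:
            parent[find(i)] = find(j)

    communities = defaultdict(set)
    for i, c in enumerate(big):
        communities[find(i)] |= c
    return sorted(communities.values(), key=lambda s: sorted(s))


# Hypothetical toy graph: two triangles sharing an edge, plus a third
# triangle attached through node 4 only.
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (4, 5), (4, 6), (5, 6)]
adj = build_adjacency(edges)
cliques = maximal_cliques(adj)
communities = clique_percolation(cliques, k=3)
print(communities)
```

In this toy graph, node 4 ends up in both resulting communities, which is the overlapping behaviour the abstract refers to. A distributed version would need to partition the adjacency collection and clique enumeration across map and reduce tasks, which this sketch does not attempt.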