Community Detection: A Statistical Approach

Master's === National Taiwan University === Graduate Institute of Communication Engineering === 106 === The problem of community detection in the Stochastic Block Model (SBM) is considered. The first half of this thesis is devoted to the community detection problem extended from graphs to hypergraphs. We propose a more general hypergraph generative model termed d-h...

Bibliographic Details
Main Authors: Chung-Yi Lin, 林宗毅
Other Authors: 王奕翔
Format: Others
Language: en_US
Published: 2018
Online Access: http://ndltd.ncl.edu.tw/handle/2aq2j5
id ndltd-TW-106NTU05435049
record_format oai_dc
spelling ndltd-TW-106NTU05435049 2019-05-30T03:50:44Z http://ndltd.ncl.edu.tw/handle/2aq2j5 Community Detection: A Statistical Approach 統計觀點下的叢集偵測問題 Chung-Yi Lin 林宗毅 Master's, National Taiwan University, Graduate Institute of Communication Engineering, 106 王奕翔 2018 thesis 128 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Taiwan University === Graduate Institute of Communication Engineering === 106 === The problem of community detection in the Stochastic Block Model (SBM) is considered. The first half of this thesis is devoted to the community detection problem extended from graphs to hypergraphs. We propose a more general hypergraph generative model, termed d-hSBM, and characterize the asymptotic misclassification ratio under it in the minimax sense. The achievability part is settled first information-theoretically, with the Maximum Likelihood Estimator (MLE) under 3-hSBM, and then computationally efficiently, with a two-step algorithm for d-hSBM of any order d. The converse lower bound is established by identifying a smaller parameter space that contains the dominant error events. The second half of this thesis considers the problem of estimating the number of communities itself, separately from the clustering task. As an attempt to characterize the fundamental limit in this formulation, we show that the MLE, which is optimal from a Bayesian perspective, is consistent, and that its form may further accommodate a sparser connectivity level. In addition, an efficient spectral method, EigenGap, is proposed along with a theoretical guarantee. Experimental results on both synthetic and real-world data corroborate our theoretical findings.
author2 王奕翔
author_facet 王奕翔
Chung-Yi Lin
林宗毅
author Chung-Yi Lin
林宗毅
title Community Detection: A Statistical Approach
publishDate 2018
url http://ndltd.ncl.edu.tw/handle/2aq2j5