Semi-Supervised Clustering with Local Perception of User

Master's === National Tsing Hua University === Department of Computer Science === 103 === Several semi-supervised clustering algorithms have been proposed that form clusters by exploiting side information collected from users. This side information falls mainly into two categories: one is seed indication information based on the global cluster structure; the other...


Bibliographic Details
Main Authors: Gong, Xin-Yang, 鞏新陽
Other Authors: Wu, Shan-Hung
Format: Others
Language: en_US
Published: 2014
Online Access: http://ndltd.ncl.edu.tw/handle/37915919264315910589
id ndltd-TW-103NTHU5392010
record_format oai_dc
spelling ndltd-TW-103NTHU53920102016-12-19T04:14:42Z http://ndltd.ncl.edu.tw/handle/37915919264315910589 Semi-Supervised Clustering with Local Perception of User 考量使用者局部制約觀點之半監督式分群演算法 Gong,Xin -Yang 鞏新陽 Master's National Tsing Hua University Department of Computer Science 103 Wu, Shan -Hung 吳尚鴻 2014 thesis 32 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Tsing Hua University === Department of Computer Science === 103 === Several semi-supervised clustering algorithms have been proposed that form clusters by exploiting side information collected from users. This side information falls mainly into two categories: seed indications, which are based on the global cluster structure, and pairwise link constraints, which are comparatively local. This paper focuses on the latter. We show that current semi-supervised clustering algorithms still have a limitation: the side information sampled from users may cover few representative instances, which we call sampling bias. Sampling bias can mislead current algorithms and cause a non-negligible difference between the identified clusters and the true clusters perceived by users. To address this limitation, we present a new clustering algorithm, perception transform analysis (PTA), which takes users' perception words into account together with traditional side information by modeling the perception words as perception vectors. Because the focus is on local side information, each perception vector models the concepts behind a must-link constraint and can be collected from users together with the must-links. To verify the effectiveness of the proposed algorithm, we compare it with state-of-the-art semi-supervised clustering algorithms. Extensive experiments on real datasets demonstrate its advantages and its robustness to sampling bias.
author2 Wu, Shan -Hung
author_facet Wu, Shan -Hung
Gong,Xin -Yang
鞏新陽
author Gong,Xin -Yang
鞏新陽
spellingShingle Gong,Xin -Yang
鞏新陽
Semi-Supervised Clustering with Local Perception of User
author_sort Gong,Xin -Yang
title Semi-Supervised Clustering with Local Perception of User
title_short Semi-Supervised Clustering with Local Perception of User
title_full Semi-Supervised Clustering with Local Perception of User
title_fullStr Semi-Supervised Clustering with Local Perception of User
title_full_unstemmed Semi-Supervised Clustering with Local Perception of User
title_sort semi-supervised clustering with local perception of user
publishDate 2014
url http://ndltd.ncl.edu.tw/handle/37915919264315910589
work_keys_str_mv AT gongxinyang semisupervisedclusteringwithlocalperceptionofuser
AT gǒngxīnyáng semisupervisedclusteringwithlocalperceptionofuser
AT gongxinyang kǎoliàngshǐyòngzhějúbùzhìyuēguāndiǎnzhībànjiāndūshìfēnqúnyǎnsuànfǎ
AT gǒngxīnyáng kǎoliàngshǐyòngzhějúbùzhìyuēguāndiǎnzhībànjiāndūshìfēnqúnyǎnsuànfǎ
_version_ 1718401249592737792