Clustering with Labeled and Unlabeled Data Based on Constrained-Nonnegative Matrix Factorization
Master's === National Chiao Tung University === Institute of Computer Science and Engineering === 100 === Semi-supervised clustering methods, which aim to cluster a data set under the guidance of some supervisory information, have become a topic of significant research. The supervisory information is usually used as constraints to bias clustering toward a g...
Main Authors: | Li, Hsuan-Hsun, 李炫勳 |
---|---|
Other Authors: | Lee, Chia-Hoang |
Format: | Others |
Language: | zh-TW |
Published: | 2012 |
Online Access: | http://ndltd.ncl.edu.tw/handle/30554527577087405135 |
id |
ndltd-TW-100NCTU5394108 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-100NCTU5394108 2016-03-28T04:20:38Z http://ndltd.ncl.edu.tw/handle/30554527577087405135 Clustering with Labeled and Unlabeled Data Based on Constrained-Nonnegative Matrix Factorization 基於Constrained-Nonnegative Matrix Factorization之半監督式分群法 Li, Hsuan-Hsun 李炫勳 Master's, National Chiao Tung University, Institute of Computer Science and Engineering, 100 Semi-supervised clustering methods, which aim to cluster a data set under the guidance of some supervisory information, have become a topic of significant research. The supervisory information is usually used as constraints to bias clustering toward a good region of the search space. In this paper, we propose a semi-supervised algorithm, Constrained-Nonnegative Matrix Factorization, which uses a small amount of labeled data as constraints to cluster data. The proposed algorithm is a matrix factorization algorithm. Intuitively, a good initial point can speed up clustering convergence and may lead to a better local optimum. We therefore devise an algorithm, Constrained-Fuzzy C-means, to obtain the initial point. The evaluation function is a key element in assessing the solution computed by Constrained-Nonnegative Matrix Factorization, so we also discuss how that solution should be evaluated. Finally, we conduct experiments on several data sets, including CiteULike, Classic3, 20 Newsgroups and Reuters, and compare the results with other semi-supervised learning algorithms. The experimental results indicate that the proposed method can effectively improve clustering performance. Lee, Chia-Hoang 李嘉晃 2012 thesis 39 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === National Chiao Tung University === Institute of Computer Science and Engineering === 100 === Semi-supervised clustering methods, which aim to cluster a data set under the guidance of some supervisory information, have become a topic of significant research. The supervisory information is usually used as constraints to bias clustering toward a good region of the search space. In this paper, we propose a semi-supervised algorithm, Constrained-Nonnegative Matrix Factorization, which uses a small amount of labeled data as constraints to cluster data. The proposed algorithm is a matrix factorization algorithm. Intuitively, a good initial point can speed up clustering convergence and may lead to a better local optimum. We therefore devise an algorithm, Constrained-Fuzzy C-means, to obtain the initial point. The evaluation function is a key element in assessing the solution computed by Constrained-Nonnegative Matrix Factorization, so we also discuss how that solution should be evaluated. Finally, we conduct experiments on several data sets, including CiteULike, Classic3, 20 Newsgroups and Reuters, and compare the results with other semi-supervised learning algorithms. The experimental results indicate that the proposed method can effectively improve clustering performance. |
author2 |
Lee, Chia-Hoang |
author_facet |
Lee, Chia-Hoang Li, Hsuan-Hsun 李炫勳 |
author |
Li, Hsuan-Hsun 李炫勳 |
spellingShingle |
Li, Hsuan-Hsun 李炫勳 Clustering with Labeled and Unlabeled Data Based on Constrained-Nonnegative Matrix Factorization |
author_sort |
Li, Hsuan-Hsun |
title |
Clustering with Labeled and Unlabeled Data Based on Constrained-Nonnegative Matrix Factorization |
title_short |
Clustering with Labeled and Unlabeled Data Based on Constrained-Nonnegative Matrix Factorization |
title_full |
Clustering with Labeled and Unlabeled Data Based on Constrained-Nonnegative Matrix Factorization |
title_fullStr |
Clustering with Labeled and Unlabeled Data Based on Constrained-Nonnegative Matrix Factorization |
title_full_unstemmed |
Clustering with Labeled and Unlabeled Data Based on Constrained-Nonnegative Matrix Factorization |
title_sort |
clustering with labeled and unlabeled data based on constrained-nonnegative matrix factorization |
publishDate |
2012 |
url |
http://ndltd.ncl.edu.tw/handle/30554527577087405135 |
work_keys_str_mv |
AT lihsuanhsun clusteringwithlabeledandunlabeleddatabasedonconstrainednonnegativematrixfactorization AT lǐxuànxūn clusteringwithlabeledandunlabeleddatabasedonconstrainednonnegativematrixfactorization AT lihsuanhsun jīyúconstrainednonnegativematrixfactorizationzhībànjiāndūshìfēnqúnfǎ AT lǐxuànxūn jīyúconstrainednonnegativematrixfactorizationzhībànjiāndūshìfēnqúnfǎ |
_version_ |
1718213422733066240 |
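
The abstract describes clustering by nonnegative matrix factorization, with a small set of labeled items acting as constraints. This record does not give the thesis's objective function or its Constrained-Fuzzy C-means initializer, so the sketch below is only an illustration of the general idea, not the author's algorithm. It assumes the standard multiplicative updates for the Frobenius-norm objective, a random starting point, and a simple constraint that pins each labeled row of the membership matrix to its known cluster. All function names (`constrained_nmf`, `clamp`, `assign`) and the toy data are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def clamp(w, labels, k):
    """Assumed constraint: pin each labeled row of W to a one-hot membership."""
    for i, lab in labels.items():
        w[i] = [1.0 if j == lab else 0.0 for j in range(k)]


def constrained_nmf(x, k, labels, iters=300, seed=0, eps=1e-9):
    """Factor nonnegative x (n x d) as W (n x k) times H (k x d).

    labels maps a row index to its known cluster; those rows of W stay fixed.
    """
    rng = random.Random(seed)
    n, d = len(x), len(x[0])
    # Random positive start (the thesis uses a dedicated initializer instead).
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(d)] for _ in range(k)]
    clamp(w, labels, k)
    for _ in range(iters):
        # H <- H * (W^T X) / (W^T W H)
        wt = transpose(w)
        num = matmul(wt, x)
        den = matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(d)]
             for a in range(k)]
        # W <- W * (X H^T) / (W H H^T), then restore the labeled rows
        ht = transpose(h)
        num = matmul(x, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
        clamp(w, labels, k)
    return w, h


def assign(w):
    """Cluster of each row = index of its largest membership weight."""
    return [max(range(len(row)), key=row.__getitem__) for row in w]


# Toy term-count matrix: rows 0-2 use the first two features, rows 3-5 the last two.
x = [[5, 4, 0, 0], [4, 5, 0, 0], [5, 5, 0, 0],
     [0, 0, 4, 5], [0, 0, 5, 4], [0, 0, 5, 5]]
w, h = constrained_nmf(x, k=2, labels={0: 0, 3: 1})
print(assign(w))
```

Pinning the labeled rows means the corresponding rows of `H` are pulled toward the labeled examples, so unlabeled rows with similar features tend to receive the same cluster index. A softer penalty term, or a better starting point such as the one the abstract mentions, would be natural alternatives to this hard constraint.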