Text Mining with Semi-Supervised Learning
Doctoral === National Chiao Tung University === Institute of Computer Science and Engineering === 103 === As the Internet grows, an overwhelming number of information sources, including documents and blog articles, have become available on the web. These information sources contain a great deal of semantic information, since they are originally created to deliver information to th...
Main Authors: | Hsaio, Wen-Hoar, 蕭文豪 |
---|---|
Other Authors: | Lee, Chia-Hoang |
Format: | Others |
Language: | en_US |
Published: | 2015 |
Online Access: | http://ndltd.ncl.edu.tw/handle/40666046315556759662 |
id |
ndltd-TW-103NCTU5394091 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-103NCTU5394091 2016-07-02T04:29:15Z http://ndltd.ncl.edu.tw/handle/40666046315556759662 Text Mining with Semi-Supervised Learning 運用半監督式學習法之文件探勘研究 Hsaio, Wen-Hoar 蕭文豪 Doctoral National Chiao Tung University Institute of Computer Science and Engineering 103 As the Internet grows, an overwhelming number of information sources, including documents and blog articles, have become available on the web. These information sources contain a great deal of semantic information, since they are originally created to deliver information to people. Organizing these articles and documents effectively and automatically has been an attractive research topic for the machine learning community. Semi-supervised learning, which learns from a combination of labeled and unlabeled data, is a machine learning approach that lies between unsupervised learning and supervised learning. It has recently become an active research area in machine learning and has received a lot of attention over the last decade. In addition, sparse representations have proven to be an extremely powerful tool for acquiring, representing, and compressing high-dimensional objects in signal processing and computer vision. Moreover, learning with Universum, which uses examples drawn from distributions different from that of the target data to estimate prior model information, is a popular research subject in machine learning. This thesis focuses on text mining with semi-supervised learning and proposes four semi-supervised learning algorithms: Constrained-PLSA, SSS-MF, Semi-LDC and ԱSemi-AdaBoost.MH. The thesis conducts experiments on four well-known real-world data sets and compares the proposed algorithms with several state-of-the-art semi-supervised learning algorithms. The experimental results indicate that the proposed methods generally outperform the other semi-supervised learning methods on these data sets. Lee, Chia-Hoang Liu, Chien-Liang 李嘉晃 劉建良 2015 thesis 119 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Doctoral === National Chiao Tung University === Institute of Computer Science and Engineering === 103 === As the Internet grows, an overwhelming number of information sources, including documents and blog articles, have become available on the web. These information sources contain a great deal of semantic information, since they are originally created to deliver information to people. Organizing these articles and documents effectively and automatically has been an attractive research topic for the machine learning community. Semi-supervised learning, which learns from a combination of labeled and unlabeled data, is a machine learning approach that lies between unsupervised learning and supervised learning. It has recently become an active research area in machine learning and has received a lot of attention over the last decade. In addition, sparse representations have proven to be an extremely powerful tool for acquiring, representing, and compressing high-dimensional objects in signal processing and computer vision. Moreover, learning with Universum, which uses examples drawn from distributions different from that of the target data to estimate prior model information, is a popular research subject in machine learning. This thesis focuses on text mining with semi-supervised learning and proposes four semi-supervised learning algorithms: Constrained-PLSA, SSS-MF, Semi-LDC and ԱSemi-AdaBoost.MH. The thesis conducts experiments on four well-known real-world data sets and compares the proposed algorithms with several state-of-the-art semi-supervised learning algorithms. The experimental results indicate that the proposed methods generally outperform the other semi-supervised learning methods on these data sets. |
author2 |
Lee, Chia-Hoang |
author_facet |
Lee, Chia-Hoang Hsaio, Wen-Hoar 蕭文豪 |
author |
Hsaio, Wen-Hoar 蕭文豪 |
spellingShingle |
Hsaio, Wen-Hoar 蕭文豪 Text Mining with Semi-Supervised Learning |
author_sort |
Hsaio, Wen-Hoar |
title |
Text Mining with Semi-Supervised Learning |
title_short |
Text Mining with Semi-Supervised Learning |
title_full |
Text Mining with Semi-Supervised Learning |
title_fullStr |
Text Mining with Semi-Supervised Learning |
title_full_unstemmed |
Text Mining with Semi-Supervised Learning |
title_sort |
text mining with semi-supervised learning |
publishDate |
2015 |
url |
http://ndltd.ncl.edu.tw/handle/40666046315556759662 |
work_keys_str_mv |
AT hsaiowenhoar textminingwithsemisupervisedlearning AT xiāowénháo textminingwithsemisupervisedlearning AT hsaiowenhoar yùnyòngbànjiāndūshìxuéxífǎzhīwénjiàntànkānyánjiū AT xiāowénháo yùnyòngbànjiāndūshìxuéxífǎzhīwénjiàntànkānyánjiū |
_version_ |
1718333022298701824 |