Text Mining with Semi-Supervised Learning

PhD === National Chiao Tung University (國立交通大學) === Institute of Computer Science and Engineering (資訊科學與工程研究所) === 103 === As the Internet grows, an overwhelming number of information sources, including documents and blog articles, have become available on the web. These sources carry a great deal of semantic information, since they were originally created to deliver information to th...

Bibliographic Details
Main Authors: Hsaio, Wen-Hoar, 蕭文豪
Other Authors: Lee, Chia-Hoang
Format: Others
Language: en_US
Published: 2015
Online Access: http://ndltd.ncl.edu.tw/handle/40666046315556759662
id ndltd-TW-103NCTU5394091
record_format oai_dc
spelling ndltd-TW-103NCTU53940912016-07-02T04:29:15Z http://ndltd.ncl.edu.tw/handle/40666046315556759662 Text Mining with Semi-Supervised Learning 運用半監督式學習法之文件探勘研究 Hsaio, Wen-Hoar 蕭文豪 PhD National Chiao Tung University (國立交通大學) Institute of Computer Science and Engineering (資訊科學與工程研究所) 103 As the Internet grows, an overwhelming number of information sources, including documents and blog articles, have become available on the web. These sources carry a great deal of semantic information, since they were originally created to deliver information to people. How to organize these articles and documents effectively and automatically has been an attractive research topic for the machine learning community. Semi-supervised learning, which learns from a combination of labeled and unlabeled data, is a machine learning approach that lies between unsupervised and supervised learning. It has become an active research area in machine learning and has received a lot of attention over the last decade. In addition, sparse representations have proven to be an extremely powerful tool for acquiring, representing, and compressing high-dimensional objects in signal processing and computer vision. Moreover, learning with Universum, which uses examples drawn from distributions different from the target ones to estimate prior model information, is a popular research subject in machine learning. This thesis focuses on text mining with semi-supervised learning and proposes four semi-supervised learning algorithms: Constrained-PLSA, SSS-MF, Semi-LDC, and USemi-AdaBoost.MH. Experiments are conducted on four well-known real data sets, and the proposed algorithms are compared with several state-of-the-art semi-supervised learning algorithms. The experimental results indicate that the proposed methods generally outperform the compared semi-supervised learning methods on the given data sets. Lee, Chia-Hoang Liu, Chien-Liang 李嘉晃 劉建良 2015 學位論文 ; thesis 119 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description PhD === National Chiao Tung University (國立交通大學) === Institute of Computer Science and Engineering (資訊科學與工程研究所) === 103 === As the Internet grows, an overwhelming number of information sources, including documents and blog articles, have become available on the web. These sources carry a great deal of semantic information, since they were originally created to deliver information to people. How to organize these articles and documents effectively and automatically has been an attractive research topic for the machine learning community. Semi-supervised learning, which learns from a combination of labeled and unlabeled data, is a machine learning approach that lies between unsupervised and supervised learning. It has become an active research area in machine learning and has received a lot of attention over the last decade. In addition, sparse representations have proven to be an extremely powerful tool for acquiring, representing, and compressing high-dimensional objects in signal processing and computer vision. Moreover, learning with Universum, which uses examples drawn from distributions different from the target ones to estimate prior model information, is a popular research subject in machine learning. This thesis focuses on text mining with semi-supervised learning and proposes four semi-supervised learning algorithms: Constrained-PLSA, SSS-MF, Semi-LDC, and USemi-AdaBoost.MH. Experiments are conducted on four well-known real data sets, and the proposed algorithms are compared with several state-of-the-art semi-supervised learning algorithms. The experimental results indicate that the proposed methods generally outperform the compared semi-supervised learning methods on the given data sets.
author2 Lee, Chia-Hoang
author_facet Lee, Chia-Hoang
Hsaio, Wen-Hoar
蕭文豪
author Hsaio, Wen-Hoar
蕭文豪
spellingShingle Hsaio, Wen-Hoar
蕭文豪
Text Mining with Semi-Supervised Learning
author_sort Hsaio, Wen-Hoar
title Text Mining with Semi-Supervised Learning
title_short Text Mining with Semi-Supervised Learning
title_full Text Mining with Semi-Supervised Learning
title_fullStr Text Mining with Semi-Supervised Learning
title_full_unstemmed Text Mining with Semi-Supervised Learning
title_sort text mining with semi-supervised learning
publishDate 2015
url http://ndltd.ncl.edu.tw/handle/40666046315556759662
work_keys_str_mv AT hsaiowenhoar textminingwithsemisupervisedlearning
AT xiāowénháo textminingwithsemisupervisedlearning
AT hsaiowenhoar yùnyòngbànjiāndūshìxuéxífǎzhīwénjiàntànkānyánjiū
AT xiāowénháo yùnyòngbànjiāndūshìxuéxífǎzhīwénjiàntànkānyánjiū
_version_ 1718333022298701824