A Feature Selection Method Based on Text Segmentation of E-Books

Master's === National Cheng Kung University === Institute of Information Management === 98 === With the exponential growth of information technology and the Internet, paper books can be transformed into e-books. People can obtain these e-books from the Internet and download them to an e-reader, which makes it more convenient to absorb knowledge from books. However, the num...


Bibliographic Details
Main Authors:	Ming-Wei Lai, 賴銘偉
Other Authors:	Hei-Chia Wang
Format:	Others
Language:	zh-TW
Published:	2010
Online Access:	http://ndltd.ncl.edu.tw/handle/02023913522965767774

id	ndltd-TW-098NCKU5396018
record_format	oai_dc
spelling	ndltd-TW-098NCKU5396018 2015-11-06T04:03:45Z http://ndltd.ncl.edu.tw/handle/02023913522965767774 A Feature Selection Method Based on Text Segmentation of E-Books 基於文件分段之電子書特徵選取 Ming-Wei Lai 賴銘偉 Master's National Cheng Kung University Institute of Information Management 98 With the exponential growth of information technology and the Internet, paper books can be transformed into e-books. People can obtain these e-books from the Internet and download them to an e-reader, which makes it more convenient to absorb knowledge from books. However, the number of e-books has become very large, and classifying them costs a great deal of time and energy. Traditional classification approaches, such as decision trees, k-nearest neighbor, naïve Bayes, and support vector machines, usually select feature words from the content, and these words form a feature space. The longer an article is, the more feature words it is likely to generate and the higher the dimension of the feature space, which complicates the subsequent classification process. Classification therefore usually includes a feature selection step that filters out undesirable feature words. However, e-books are usually much longer than general articles. With traditional approaches, e-books generate a large number of feature words, which complicates the subsequent classification process and can even lose important feature words, because the length of the content reduces these words' overall weights. We therefore present a novel feature selection approach that applies a text segmentation algorithm. With this algorithm, an e-book can be cut into several segments. We analyze the importance of all words in these segments and select the important feature words for every segment. We expect that the feature words selected by our approach can improve the accuracy of classification. Hei-Chia Wang 王惠嘉 2010 Degree thesis ; thesis 90 zh-TW
collection	NDLTD
language	zh-TW
format	Others
sources	NDLTD
description	Master's === National Cheng Kung University === Institute of Information Management === 98 === With the exponential growth of information technology and the Internet, paper books can be transformed into e-books. People can obtain these e-books from the Internet and download them to an e-reader, which makes it more convenient to absorb knowledge from books. However, the number of e-books has become very large, and classifying them costs a great deal of time and energy. Traditional classification approaches, such as decision trees, k-nearest neighbor, naïve Bayes, and support vector machines, usually select feature words from the content, and these words form a feature space. The longer an article is, the more feature words it is likely to generate and the higher the dimension of the feature space, which complicates the subsequent classification process. Classification therefore usually includes a feature selection step that filters out undesirable feature words. However, e-books are usually much longer than general articles. With traditional approaches, e-books generate a large number of feature words, which complicates the subsequent classification process and can even lose important feature words, because the length of the content reduces these words' overall weights. We therefore present a novel feature selection approach that applies a text segmentation algorithm. With this algorithm, an e-book can be cut into several segments. We analyze the importance of all words in these segments and select the important feature words for every segment. We expect that the feature words selected by our approach can improve the accuracy of classification.
author2	Hei-Chia Wang
author_facet	Hei-Chia Wang Ming-Wei Lai 賴銘偉
author	Ming-Wei Lai 賴銘偉
spellingShingle	Ming-Wei Lai 賴銘偉 A Feature Selection Method Based on Text Segmentation of E-Books
author_sort	Ming-Wei Lai
title	A Feature Selection Method Based on Text Segmentation of E-Books
title_short	A Feature Selection Method Based on Text Segmentation of E-Books
title_full	A Feature Selection Method Based on Text Segmentation of E-Books
title_fullStr	A Feature Selection Method Based on Text Segmentation of E-Books
title_full_unstemmed	A Feature Selection Method Based on Text Segmentation of E-Books
title_sort	feature selection method based on text segmentation of e-books
publishDate	2010
url	http://ndltd.ncl.edu.tw/handle/02023913522965767774
work_keys_str_mv	AT mingweilai afeatureselectionmethodbasedontextsegmentationofebooks AT làimíngwěi afeatureselectionmethodbasedontextsegmentationofebooks AT mingweilai jīyúwénjiànfēnduànzhīdiànzishūtèzhēngxuǎnqǔ AT làimíngwěi jīyúwénjiànfēnduànzhīdiànzishūtèzhēngxuǎnqǔ AT mingweilai featureselectionmethodbasedontextsegmentationofebooks AT làimíngwěi featureselectionmethodbasedontextsegmentationofebooks
_version_	1718125296995008512
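
The idea in the description (cut a long text into segments, score words within each segment, keep the important words of every segment) can be illustrated with a minimal sketch. This is not the thesis's implementation: the record does not name the segmentation algorithm or the importance measure, so the sketch assumes fixed-size blocks of sentences as a stand-in for text segmentation and a smoothed TF-IDF score computed across segments as the importance measure. All function and parameter names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def split_into_segments(text, sentences_per_segment=2):
    """Cut text into consecutive blocks of a fixed number of sentences.

    Assumption: a simple stand-in for a real text segmentation algorithm.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [
        " ".join(sentences[i:i + sentences_per_segment])
        for i in range(0, len(sentences), sentences_per_segment)
    ]


def select_features(text, top_k=2, sentences_per_segment=2):
    """Return the union of the top_k highest-scoring words of every segment."""
    segments = [tokenize(s) for s in split_into_segments(text, sentences_per_segment)]
    segments = [seg for seg in segments if seg]
    n = len(segments)

    # Segment frequency: in how many segments each word occurs.
    seg_freq = Counter(word for seg in segments for word in set(seg))

    selected = set()
    for seg in segments:
        counts = Counter(seg)
        # Assumed importance measure: term frequency within the segment
        # times a smoothed inverse segment frequency.
        scores = {
            word: (count / len(seg)) * (math.log((1 + n) / (1 + seg_freq[word])) + 1)
            for word, count in counts.items()
        }
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        selected.update(ranked[:top_k])
    return selected


if __name__ == "__main__":
    sample = "Cats purr. Cats sleep. Dogs bark. Dogs run."
    print(select_features(sample, top_k=1))
```

Because each word is scored against the length of its own segment and not against the whole book, a word that dominates one segment can still be selected even when its frequency over the full text is low, which is the loss the description attributes to traditional approaches.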


Similar Items