The Application of Keywords Extraction

Master's === National Chengchi University === Department of Statistics === 107 === Text mining has become a popular research area since IBM proposed the term Big Data in 2010. Since then, many texts have been digitized, and more scholars have devoted themselves to developing quantitative tools that give texts semantic meaning without the help...

Bibliographic Details
Main Authors:	Hsu, Cheng-En, 許承恩
Other Authors:	Yue, Ching-Syang
Format:	Others
Language:	zh-TW
Published:	2019
Online Access:	http://ndltd.ncl.edu.tw/handle/uw62va

id	ndltd-TW-107NCCU5337019
record_format	oai_dc
spelling	ndltd-TW-107NCCU5337019 2019-11-28T05:23:26Z http://ndltd.ncl.edu.tw/handle/uw62va The Application of Keywords Extraction 關鍵詞偵測方法的比較與應用 Hsu, Cheng-En 許承恩 Master's, National Chengchi University, Department of Statistics, 107 Yue, Ching-Syang Cheng, Wen-Huei 余清祥 鄭文惠 2019 thesis 57 zh-TW
collection	NDLTD
language	zh-TW
format	Others
sources	NDLTD
description	Master's === National Chengchi University === Department of Statistics === 107 === Text mining has become a popular research area since IBM proposed the term Big Data in 2010. Since then, many texts have been digitized, and more scholars have devoted themselves to developing quantitative tools that give texts semantic meaning without the help of human experts. Such tools greatly increase the efficiency of reading a huge amount of text, provided that the texts are properly structured. Structuring texts involves several steps, such as keyword extraction and sentiment analysis. Keyword extraction is critical: keywords can be used to summarize an article and to compare two authors' writing styles. The goal of this study is to propose a new unsupervised method for extracting keywords and to compare it with frequently used methods, including term frequency-inverse document frequency (TF-IDF), logistic regression, and machine learning models. In the empirical analysis, we considered three modern Chinese texts: one from People's Daily (514 articles from 1971-1989) and two from New Youth Magazine (volumes 7 and 8, 1919-1920). Each text contains approximately 400,000 to 600,000 words. We asked history scholars to select keywords from these three texts and treated their selections as the true keywords. We then applied the different keyword extraction methods to these texts and compared the results. We found that the proposed method performs best among all unsupervised methods and is competitive with the supervised methods.
author2	Yue, Ching-Syang
author_facet	Yue, Ching-Syang Hsu, Cheng-En 許承恩
author	Hsu, Cheng-En 許承恩
spellingShingle	Hsu, Cheng-En 許承恩 The Application of Keywords Extraction
author_sort	Hsu, Cheng-En
title	The Application of Keywords Extraction
publishDate	2019
url	http://ndltd.ncl.edu.tw/handle/uw62va

Similar Items