Dictionary-based news category classification : using sports news as example
Master's === Tamkang University === Master's Program, Department of Computer Science and Information Engineering === 104 === The rapid development of information network technology has resulted in a vast repository of data. Collecting relevant information from such a large body of data is difficult for any user. This paper aims to help users grasp key informati...
Main Authors: | Yen-Lung Lee, 李儼倫 |
---|---|
Other Authors: | Yi-Hjia Tsai |
Format: | Others |
Language: | zh-TW |
Published: | 2016 |
Online Access: | http://ndltd.ncl.edu.tw/handle/f6g5au |
id |
ndltd-TW-104TKU05392010 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-104TKU05392010 2019-05-15T23:01:41Z http://ndltd.ncl.edu.tw/handle/f6g5au Dictionary-based news category classification : using sports news as example 用字典為基礎判別新聞事件類型:以體育新聞為例 Yen-Lung Lee 李儼倫 Master's Tamkang University Master's Program, Department of Computer Science and Information Engineering 104 The rapid development of information network technology has resulted in a vast repository of data. Collecting relevant information from such a large body of data is difficult for any user. This paper aims to help users grasp key information in a short period of time. We observe that the most frequent terms in an article can serve as keywords for that article. The theme of an article can be readily grasped from these keywords. Users can therefore find the information they want through keywords and significantly reduce unnecessary search time. Proper word segmentation enables extraction of an article's theme, and articles can then be classified by differentiating their themes. We use 320 articles in the theme classification experiment. These articles are divided into two sets: training and testing. There are 285 training samples, all belonging to the sports news theme. There are 15 testing samples consisting of themes picked at random. The classifier picks out 6 articles belonging to the sports news theme among the 15 testing samples. Among the 20 negative samples, there are 4 false positives, all due to names related to sports events. Yi-Hjia Tsai 蔡憶佳 2016 Thesis 45 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === Tamkang University === Master's Program, Department of Computer Science and Information Engineering === 104 === The rapid development of information network technology has resulted in a vast repository of data.
Collecting relevant information from such a large body of data is difficult for any user.
This paper aims to help users grasp key information in a short period of time.
We observe that the most frequent terms in an article can serve as keywords for that article.
The theme of an article can be readily grasped from these keywords.
Users can therefore find the information they want through keywords and significantly reduce unnecessary search time.
Proper word segmentation enables extraction of an article's theme,
and articles can then be classified by differentiating their themes.
We use 320 articles in the theme classification experiment. These articles are divided into two sets: training and testing.
There are 285 training samples,
all belonging to the sports news theme.
There are 15 testing samples consisting of themes picked at random.
The classifier picks out 6 articles belonging to the sports news theme among the 15 testing samples.
Among the 20 negative samples, there are 4 false positives, all due to names related to sports events.
|
author2 |
Yi-Hjia Tsai |
author_facet |
Yi-Hjia Tsai Yen-Lung Lee 李儼倫 |
author |
Yen-Lung Lee 李儼倫 |
spellingShingle |
Yen-Lung Lee 李儼倫 Dictionary-based news category classification : using sports news as example |
author_sort |
Yen-Lung Lee |
title |
Dictionary-based news category classification : using sports news as example |
title_short |
Dictionary-based news category classification : using sports news as example |
title_full |
Dictionary-based news category classification : using sports news as example |
title_fullStr |
Dictionary-based news category classification : using sports news as example |
title_full_unstemmed |
Dictionary-based news category classification : using sports news as example |
title_sort |
dictionary-based news category classification : using sports news as example |
publishDate |
2016 |
url |
http://ndltd.ncl.edu.tw/handle/f6g5au |
work_keys_str_mv |
AT yenlunglee dictionarybasednewscategoryclassificationusingsportsnewsasexample AT lǐyǎnlún dictionarybasednewscategoryclassificationusingsportsnewsasexample AT yenlunglee yòngzìdiǎnwèijīchǔpànbiéxīnwénshìjiànlèixíngyǐtǐyùxīnwénwèilì AT lǐyǎnlún yòngzìdiǎnwèijīchǔpànbiéxīnwénshìjiànlèixíngyǐtǐyùxīnwénwèilì |
_version_ |
1719140164100423680 |
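
The abstract describes building a keyword dictionary from term frequencies in sports articles and using it to decide whether a new article belongs to the sports theme. The record gives no implementation details, so the following is only a minimal sketch of that general idea, not the thesis's method. The function names, the regex tokenizer (a stand-in for real Chinese word segmentation), the stopword list, `top_k`, and the `threshold` value are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-letters (assumed stand-in for word segmentation)."""
    return re.findall(r"[a-z]+", text.lower())


def build_dictionary(training_articles, top_k=20, stopwords=frozenset()):
    """Collect the top_k most frequent non-stopword terms across the training articles."""
    counts = Counter()
    for article in training_articles:
        counts.update(t for t in tokenize(article) if t not in stopwords)
    return {term for term, _ in counts.most_common(top_k)}


def keyword_overlap(article, dictionary):
    """Fraction of the article's tokens that appear in the dictionary."""
    tokens = tokenize(article)
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in dictionary) / len(tokens)


def is_sports(article, dictionary, threshold=0.2):
    """Label an article as sports when its overlap reaches the (assumed) threshold."""
    return keyword_overlap(article, dictionary) >= threshold


# Toy training set: every article is a sports article, as in the abstract.
training = [
    "The team won the match after a late goal",
    "The coach praised the team after the final match",
    "A late goal decided the league match",
]
STOPWORDS = frozenset({"the", "a", "after", "of", "in", "and"})
sports_dictionary = build_dictionary(training, top_k=4, stopwords=STOPWORDS)

print(is_sports("The team scored a late goal in the match", sports_dictionary))
print(is_sports("The central bank raised interest rates", sports_dictionary))
```

A dictionary built this way also matches any non-sports article that happens to reuse the same terms, which is the kind of false positive the abstract attributes to names related to sports events.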