Impact of Pattern Dependency on Web Mining Models
Master's === Asia University (亞洲大學) === 資訊多媒體應用學系 === 102 === With the rapid development of the World Wide Web and information technology, the internet has become an indispensable part of people's lives. Therefore, how to retrieve information from large amounts of data and convert it into useful knowledge is a very important is...
Main Authors: | Lin, Yung-Chang 林永昌 |
---|---|
Other Authors: | Wu, Sheng-Tang |
Format: | Others |
Language: | zh-TW |
Published: | 2013 |
Online Access: | http://ndltd.ncl.edu.tw/handle/29985312664426971911 |
id |
ndltd-TW-102THMU0394005 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-102THMU0394005 2017-01-14T04:15:13Z http://ndltd.ncl.edu.tw/handle/29985312664426971911 Impact of Pattern Dependency on Web Mining Models 樣式依存度問題於網頁探勘之影響研究 Lin, Yung-Chang 林永昌 碩士 亞洲大學 資訊多媒體應用學系 102 Wu, Sheng-Tang 吳勝堂 2013 學位論文 ; thesis 66 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === Asia University (亞洲大學) === 資訊多媒體應用學系 === 102 === With the rapid development of the World Wide Web and information technology, the internet has become an indispensable part of people's lives. How to retrieve information from large amounts of data and convert it into useful knowledge is therefore a very important issue, and one of the main research topics in the field of knowledge discovery. Web mining can be divided into three categories: Web Content Mining, Web Structure Mining, and Web Usage Mining. The central problem of Web Content Mining is how to respond to users' needs accurately and quickly while avoiding too much irrelevant information. Patterns, which are used to describe semantics in a web mining system, are therefore of particular importance. This research focuses on the dependence between patterns and features. We analyze feature selection mechanisms to understand their impact on pattern generation in the Pattern Taxonomy Model (PTM). The experimental results show that the ratio method for determining the number of features performs well and is not affected by the number of training samples. In addition, among three feature selection methods, TFIDF, Information Gain (IG), and Mutual Information (MI), IG performs best under both mechanisms for determining the number of features. We recommend using IG as the feature selection method in PTM models and using the top 70% of features for pattern composition for the best performance. |
author2 |
Wu, Sheng-Tang |
author_facet |
Wu, Sheng-Tang Lin, Yung-Chang 林永昌 |
author |
Lin, Yung-Chang 林永昌 |
spellingShingle |
Lin, Yung-Chang 林永昌 Impact of Pattern Dependency on Web Mining Models |
author_sort |
Lin, Yung-Chang |
title |
Impact of Pattern Dependency on Web Mining Models |
publishDate |
2013 |
url |
http://ndltd.ncl.edu.tw/handle/29985312664426971911 |
_version_ |
1718408246339829760 |
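The description field recommends ranking features by Information Gain (IG) and keeping the top 70% of them. The record does not give the thesis's exact formulation, so the following is only a minimal sketch under stated assumptions: IG is taken to be the standard entropy-based definition over a binary relevant/irrelevant label, each document is a set of terms, and the function names (`entropy`, `information_gain`, `select_top_ratio`) and the toy documents are hypothetical, introduced purely for illustration.

```python
import math
from collections import Counter


def entropy(pos, neg):
    """Binary entropy (in bits) of a split with `pos` and `neg` documents."""
    total = pos + neg
    if total == 0:
        return 0.0
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h


def information_gain(docs, labels):
    """Score every term by IG with respect to a binary label.

    docs:   list of iterables of terms (one per document)
    labels: list of bools, True for relevant documents
    Assumption: standard textbook IG, not necessarily the thesis's variant.
    """
    n = len(docs)
    n_pos = sum(labels)
    n_neg = n - n_pos
    base = entropy(n_pos, n_neg)

    # Document frequency of each term within each class.
    df_pos, df_neg = Counter(), Counter()
    for terms, label in zip(docs, labels):
        (df_pos if label else df_neg).update(set(terms))

    scores = {}
    for term in set(df_pos) | set(df_neg):
        with_pos, with_neg = df_pos[term], df_neg[term]
        n_with = with_pos + with_neg
        # Entropy remaining after splitting on presence/absence of the term.
        cond = (n_with / n) * entropy(with_pos, with_neg)
        cond += ((n - n_with) / n) * entropy(n_pos - with_pos, n_neg - with_neg)
        scores[term] = base - cond
    return scores


def select_top_ratio(scores, ratio=0.7):
    """Ratio method: keep the top `ratio` share of features by score."""
    k = max(1, math.ceil(len(scores) * ratio))
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]


if __name__ == "__main__":
    # Toy data invented for illustration only.
    docs = [{"web", "mining"}, {"web", "pattern"},
            {"sports", "news"}, {"news", "web"}]
    labels = [True, True, False, False]
    scores = information_gain(docs, labels)
    print(select_top_ratio(scores, ratio=0.7))
```

With the ratio method, the number of retained features scales with the vocabulary size instead of being a fixed count, which is one plausible reading of why the description reports it as insensitive to the number of training samples.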