Impact of Pattern Dependency on Web Mining Models

Bibliographic Details
Main Authors: Lin, Yung-Chang, 林永昌
Other Authors: Wu, Sheng-Tang
Format: Others
Language: zh-TW
Published: 2013
Online Access: http://ndltd.ncl.edu.tw/handle/29985312664426971911
id ndltd-TW-102THMU0394005
record_format oai_dc
spelling ndltd-TW-102THMU0394005 2017-01-14T04:15:13Z http://ndltd.ncl.edu.tw/handle/29985312664426971911 Impact of Pattern Dependency on Web Mining Models 樣式依存度問題於網頁探勘之影響研究 Lin, Yung-Chang 林永昌 碩士 亞洲大學 資訊多媒體應用學系 102 Wu, Sheng-Tang 吳勝堂 2013 學位論文 ; thesis 66 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === 亞洲大學 (Asia University) === 資訊多媒體應用學系 === 102 === With the rapid development of the World Wide Web and information technology, the Internet has become an indispensable part of people's lives. How to retrieve information from large amounts of data and convert it into useful knowledge is therefore an important issue, and it is one of the main research topics in the field of knowledge discovery. Web mining can be divided into three categories: Web content mining, Web structure mining, and Web usage mining. The central problem of Web content mining is how to respond to users' needs accurately and quickly while avoiding too much irrelevant information. Patterns, which are used to describe semantics in a Web mining system, are therefore important. This research focuses on the dependence between patterns and features. We analyze feature selection mechanisms to understand their impact on pattern generation in the Pattern Taxonomy Model (PTM). The experimental results show that the ratio method for determining the number of features performs well and is not affected by the number of training samples. In addition, among three feature selection methods, TFIDF, Information Gain (IG), and Mutual Information (MI), IG performs best under both mechanisms for determining the number of features. We recommend using IG as the feature selection method in PTM models and using the top 70% of features for pattern composition for the best performance.
author2 Wu, Sheng-Tang
author_facet Wu, Sheng-Tang
Lin, Yung-Chang
林永昌
author Lin, Yung-Chang
林永昌
author_sort Lin, Yung-Chang
title Impact of Pattern Dependency on Web Mining Models
publishDate 2013
url http://ndltd.ncl.edu.tw/handle/29985312664426971911
_version_ 1718408246339829760
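
The abstract recommends ranking features by Information Gain (IG) and keeping the top 70% of them. This record does not include the thesis's implementation, so the following is only a minimal sketch of that idea under stated assumptions: each document is a set of terms, a term counts as a binary present/absent feature, and the function names (`entropy`, `information_gain`, `select_top_ratio`) and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(docs, labels, term):
    """IG of a term, treated as a binary present/absent feature (assumption)."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    return (
        entropy(labels)
        - (len(with_term) / n) * entropy(with_term)
        - (len(without_term) / n) * entropy(without_term)
    )


def select_top_ratio(docs, labels, ratio=0.7):
    """Rank the vocabulary by IG and keep the top `ratio` share of terms."""
    vocab = sorted(set().union(*docs))
    # Sort by descending IG; break ties alphabetically for determinism.
    ranked = sorted(vocab, key=lambda t: (-information_gain(docs, labels, t), t))
    k = max(1, math.ceil(ratio * len(ranked)))
    return ranked[:k]


# Toy data (made up for illustration): 1 = relevant, 0 = irrelevant.
docs = [
    {"web", "mining", "pattern"},
    {"web", "pattern"},
    {"sport", "news"},
    {"web", "news"},
]
labels = [1, 1, 0, 0]
selected = select_top_ratio(docs, labels, ratio=0.7)
```

Selecting by ratio rather than by a fixed count makes the number of kept features scale with the vocabulary size, which is in line with the abstract's observation that the ratio method is not affected by the number of training samples.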