Efficient Mining of Web Traversal Walks with Site Topology


Bibliographic Details
Main Authors: Hua-Fu Li, 李華富
Other Authors: Man-Kwan Shan
Format: Others
Language: en_US
Published: 2001
Online Access: http://ndltd.ncl.edu.tw/handle/35035889112605397315
Description
Summary: Master's thesis === National Chengchi University === Department of Computer Science === 89 === With the progressive expansion in the size and complexity of Web sites on the World Wide Web, much research has been done on the discovery of useful and interesting Web traversal patterns. Most existing approaches focus on mining path traversal patterns or sequential patterns. In this paper, we present a new kind of pattern, the Web traversal walk, for mining Web traversal patterns. A Web traversal walk is the complete trail of a user's traversal behavior within a single Web site. Mining Web traversal walks is more helpful for understanding and predicting access patterns on a Web site. Two efficient algorithms, AM and PM, are proposed to discover Web traversal walks. PM is used when the database fits in main memory; AM is used when it does not. AM is based on the Apriori property and discovers all frequent Web traversal walks from Web logs. PM constructs a tree structure in memory from the Web logs and generates the frequent Web traversal walks from that tree. Experimental results show that the proposed methods perform well in efficiency and scalability.
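
The summary does not give the details of AM, so the following is only a minimal sketch of what a level-wise, Apriori-style search for frequent walks might look like, not the thesis's algorithm. It assumes that a walk is a contiguous sequence of pages in a session, that every step follows a hyperlink in the site topology, and that support is the number of sessions containing the walk. The names frequent_walks, sessions, links, and min_support are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def frequent_walks(sessions, links, min_support):
    """Level-wise (Apriori-style) mining of contiguous page sequences.

    sessions:    list of page-ID lists, one per user session (assumed format)
    links:       dict mapping a page to the set of pages it links to
                 (assumed representation of the site topology)
    min_support: minimum number of sessions that must contain a walk
    """

    def count_support(candidates, length):
        # Count, for each candidate, the sessions that contain it contiguously.
        counts = defaultdict(int)
        for session in sessions:
            windows = {
                tuple(session[i:i + length])
                for i in range(len(session) - length + 1)
            }
            for walk in windows & candidates:
                counts[walk] += 1
        return {w: c for w, c in counts.items() if c >= min_support}

    # Level 1: single pages.
    single_pages = {(page,) for session in sessions for page in session}
    frequent = count_support(single_pages, 1)
    result = dict(frequent)

    length = 1
    while frequent:
        length += 1
        candidates = set()
        for walk in frequent:
            # Site topology: extend a walk only along existing hyperlinks.
            for next_page in links.get(walk[-1], ()):
                candidate = walk + (next_page,)
                # Apriori pruning: the suffix one page shorter must be frequent.
                if candidate[1:] in frequent:
                    candidates.add(candidate)
        frequent = count_support(candidates, length)
        result.update(frequent)

    return result
```

Calling frequent_walks with a list of sessions, a link map, and a support threshold returns a dictionary from each frequent walk (as a tuple of page IDs) to its session count. Restricting candidate extension to the link map is one plausible way to use site topology to shrink the candidate set; whether the thesis does this in the same way is an assumption.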