Summary: | Master's === Chung Yuan Christian University === Department of Information and Computer Engineering === 88 === Many information-related systems, such as the World Wide Web and bulletin board systems, consist of documents connected by hyperlinks; such systems are often called Distributed Information-Providing Environments (DIPE). To represent a DIPE, this thesis proposes a special kind of graph, called an information graph (IG), and treats browsing behavior in a DIPE as traversal sequences on that graph.
Based on decision tree and rough set techniques, this thesis proposes an approach to classifying traversal sequences in an information graph of an Internet environment. The approach includes a method for transforming the original information graph into a simpler one, and it defines vertex similarity and sequence similarity for comparing two traversal sequences in an information graph. Time factors are then introduced into the analysis, so that general sequence patterns can be detected and used to group time traversal sequences. Once the analysis results are processed with data mining techniques, rule discovery and even prediction become straightforward in a variety of DIPEs.
An experimental system is implemented to validate the approach. The demonstration system uses a real web site to evaluate and compare the calculation results. The examples show that the proposed methods can be integrated and work together successfully.
|