Summary: | Master's === Tamkang University === Department of Computer Science and Information Engineering, In-service Master's Program === 98 === Driven by advances in computer technology and the rapid growth of the Internet, digital information is now produced on a massive scale. The Internet has become a huge source of rich and valuable information. Every Web site is like a data source, and these sources can be regarded as a database in a general sense, one even larger and more complex than a conventional database. Connected by hyperlinks, Web sites with different content and organization constitute a large heterogeneous database environment.
Without efficient search engines, finding the desired information on today's World Wide Web would be as difficult as looking for a needle in a haystack. Many commercial search engines now meet this need, for instance Google, Yahoo, Ask, and Microsoft Live Search. Search engines usually rank the results by relevance and list them with summaries for users to browse and choose from. This browsing mode is extremely inefficient, because the number of results is usually very large and most users browse only the first few results listed. In addition, a ranked list mixes results on many other sub-topics with the desired ones, which tends to cause users to miss important information. Moreover, during retrieval, many users do not keep issuing new keyword searches but instead spend more time browsing the results.
However, a major problem is that search engines whose search mechanism relies on web content and hyperlinks can reflect only web authors’ views, not readers’. In this paper, we develop a web user clustering mining technique based on the web content that users browse. According to our experimental results, the content of the websites that users browse can be classified and then applied to web recommendation.
|