The Research of Applying Bayesian Network on Automatic Web Classification-Using Chinese Internet Bookstore as an example

Bibliographic Details
Main Authors: Chia-Chi Yu, 游佳琪
Other Authors: Chong-Yen Lee
Format: Others
Language: zh-TW
Published: 2003
Online Access: http://ndltd.ncl.edu.tw/handle/51984633178257917948
Description
Summary: Master's thesis === Chinese Culture University (中國文化大學) === Graduate Institute of Information Management === 92 === The Internet has boomed over the past few years, and users rely heavily on search engines to find relevant information. Most search engines analyze web pages and search them for the keywords that users specify. The results can include large amounts of irrelevant data, because search engines are not designed to recognize the characteristics of websites. This research proposes an efficient web classification method based on a Bayesian network, and develops a web classification system to measure the correctness of the method. The system is divided into a feature analysis module and a classification inference module. The feature analysis module examines the meaningful keywords in the target website and the relationships among them. This information is passed to the knowledge management module to construct a knowledge base. The classification inference module uses the knowledge base to determine the category of the target website. The information gained during this process can also be used to extend the knowledge base through the self-learning ability provided by the system. Chinese online bookstores are used to investigate the analysis capability of the system. The results show that, of the 150 bookstore websites tested, 84.5% were classified with a probability above 65%. The experiments also show that the probability rises as the number of samples increases. The knowledge base is updated automatically through learning and supplies the data used for inference.
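The pipeline described in the summary (keyword features, a knowledge base, probabilistic inference, and self-learning) can be illustrated with a minimal sketch. It is not the thesis's implementation: it swaps in a naive Bayes classifier, the simplest Bayesian network in which each keyword depends only on the category, and ignores the relationships among keywords that the thesis models. The class name `KeywordBayesClassifier`, its methods, the sample keywords, and the use of 0.65 as a self-learning threshold are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


class KeywordBayesClassifier:
    """Hypothetical naive Bayes keyword classifier (illustrative only)."""

    def __init__(self):
        self.class_counts = Counter()               # pages seen per category
        self.keyword_counts = defaultdict(Counter)  # keyword frequencies per category
        self.vocabulary = set()                     # all keywords seen so far

    def learn(self, keywords, label):
        """Add one labeled page to the knowledge base (incremental update)."""
        self.class_counts[label] += 1
        self.keyword_counts[label].update(keywords)
        self.vocabulary.update(keywords)

    def posterior(self, keywords):
        """Return P(category | keywords) for every known category."""
        total_pages = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        log_scores = {}
        for label, pages in self.class_counts.items():
            total_words = sum(self.keyword_counts[label].values())
            score = math.log(pages / total_pages)  # log prior
            for word in keywords:
                # Laplace smoothing keeps unseen keywords from zeroing the product.
                count = self.keyword_counts[label][word]
                score += math.log((count + 1) / (total_words + vocab_size))
            log_scores[label] = score
        # Normalize in log space for numerical stability.
        peak = max(log_scores.values())
        weights = {k: math.exp(v - peak) for k, v in log_scores.items()}
        norm = sum(weights.values())
        return {k: w / norm for k, w in weights.items()}

    def classify(self, keywords, threshold=0.65, self_learn=False):
        """Return (best category, probability); optionally self-learn.

        The 0.65 default is an assumed cutoff, not a value taken from the thesis.
        """
        probs = self.posterior(keywords)
        label = max(probs, key=probs.get)
        if self_learn and probs[label] >= threshold:
            self.learn(keywords, label)  # feed confident results back in
        return label, probs[label]


if __name__ == "__main__":
    kb = KeywordBayesClassifier()
    kb.learn(["book", "author", "isbn", "cart"], "bookstore")
    kb.learn(["book", "publisher", "isbn"], "bookstore")
    kb.learn(["news", "weather", "sports"], "news")
    print(kb.classify(["isbn", "book"], self_learn=True))
```

Calling `classify` with `self_learn=True` adds a confidently classified page back into the counts, which mirrors in a simplified way the self-learning behavior the summary attributes to the system.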