Learning to Extract Objects from the World Wide Web



Bibliographic Details
Main Authors: Yi-Wei Jan, 詹益瑋
Other Authors: Jyh-Jong Tsay
Format: Others
Language: zh-TW
Published: 2004
Online Access: http://ndltd.ncl.edu.tw/handle/42268007252422218195
Description
Summary: Master's thesis === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 92 === The World Wide Web (WWW) has become a vast information store that is growing at a rapid rate. Some sites do provide search engines, but their query power is often limited, and the results come back as a large number of HTML pages. A tool that helps users locate or mine objects from these HTML pages should therefore be useful. We have implemented such a tool based on an Inductive Logic Programming (ILP) system. The input to the tool is a set of HTML pages with marks that indicate where the data of interest is located on these pages. From these marked pages the tool produces an extractor that can automatically extract objects from other pages.
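
The workflow in the summary (marked pages in, extractor out) can be illustrated with a minimal Python sketch. It is not the thesis's ILP-based method: as a much simpler stand-in, it learns one fixed left delimiter and one fixed right delimiter from the text surrounding each marked object. The `[[` and `]]` marking convention and the function names `learn_delimiters` and `extract` are assumptions made for this illustration only.

```python
import os

# Hypothetical marking convention: objects of interest are wrapped in [[ ... ]].
MARK_OPEN, MARK_CLOSE = "[[", "]]"


def _strip_marks(text):
    """Remove the marks so contexts look like the unmarked page."""
    return text.replace(MARK_OPEN, "").replace(MARK_CLOSE, "")


def _contexts(page):
    """Yield (text before, text after) for every marked object in a page."""
    pos = 0
    while True:
        start = page.find(MARK_OPEN, pos)
        if start == -1:
            return
        end = page.find(MARK_CLOSE, start)
        if end == -1:
            return
        after = end + len(MARK_CLOSE)
        yield _strip_marks(page[:start]), _strip_marks(page[after:])
        pos = after


def learn_delimiters(marked_pages):
    """Learn the longest delimiters shared by all marked examples."""
    lefts, rights = [], []
    for page in marked_pages:
        for left, right in _contexts(page):
            lefts.append(left[::-1])  # reversed: common suffix becomes common prefix
            rights.append(right)
    if not lefts:
        raise ValueError("no marked objects found")
    left_delim = os.path.commonprefix(lefts)[::-1]
    right_delim = os.path.commonprefix(rights)
    if not left_delim or not right_delim:
        raise ValueError("examples share no common delimiter")
    return left_delim, right_delim


def extract(page, left_delim, right_delim):
    """Apply the learned delimiters to an unmarked page."""
    results, pos = [], 0
    while True:
        start = page.find(left_delim, pos)
        if start == -1:
            break
        start += len(left_delim)
        end = page.find(right_delim, start)
        if end == -1:
            break
        results.append(page[start:end])
        pos = end  # delimiters may overlap, so resume at the right delimiter
    return results


training = ["<ul><li><b>[[Apple]]</b></li><li><b>[[Pear]]</b></li></ul>"]
left, right = learn_delimiters(training)
unseen = "<ul><li><b>Kiwi</b></li><li><b>Plum</b></li><li><b>Fig</b></li></ul>"
print(extract(unseen, left, right))
```

A fixed-delimiter rule like this breaks as soon as page layouts vary, which is one reason to learn richer extraction rules, as an ILP system can.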