Block-level Ranking for Intra-Website Pages

Master's === National Chiao Tung University === College of Computer Science, In-service Master's Program, Computer Science Division === 95 === According to statistical data, there were more than 14 billion web pages worldwide as of June 2007. Using this huge database efficiently is an important problem. For information whose location we do not know, we usually use search en...

Bibliographic Details
Main Authors:	Wen-Feng Yao, 姚文鋒
Other Authors:	I-Chen Wu
Format:	Others
Language:	zh-TW
Published:	2007
Online Access:	http://ndltd.ncl.edu.tw/handle/2tvrgw

id	ndltd-TW-095NCTU5392009
record_format	oai_dc
spelling	ndltd-TW-095NCTU5392009 2019-05-15T19:48:25Z http://ndltd.ncl.edu.tw/handle/2tvrgw Block-level Ranking for Intra-Website Pages 網站內網頁之區塊等級分析 Wen-Feng Yao 姚文鋒 Master's, National Chiao Tung University, College of Computer Science, In-service Master's Program, Computer Science Division 95 According to statistical data, there were more than 14 billion web pages worldwide as of June 2007. Using this huge database efficiently is an important problem. For information whose location we do not know, we usually rely on search engines to find it. For information whose location we do know, we use data extraction to increase efficiency. BODE (Browser Oriented Data Extraction), developed by our laboratory, is one such web data extraction system. Users indicate the data they want to retrieve through its GUI, and the system generates the BODE script used in the extraction process and then starts extracting. However, users need basic knowledge of BODE script syntax, XPath, and HTML tags to build a BODE script. To lower the barrier to using the BODE system, this thesis proposes an algorithm that distinguishes the useful information blocks within a single website, with the goal of generating BODE scripts automatically. I-Chen Wu 吳毅成 2007 thesis 37 zh-TW
collection	NDLTD
language	zh-TW
format	Others
sources	NDLTD
description	Master's === National Chiao Tung University === College of Computer Science, In-service Master's Program, Computer Science Division === 95 === According to statistical data, there were more than 14 billion web pages worldwide as of June 2007. Using this huge database efficiently is an important problem. For information whose location we do not know, we usually rely on search engines to find it. For information whose location we do know, we use data extraction to increase efficiency. BODE (Browser Oriented Data Extraction), developed by our laboratory, is one such web data extraction system. Users indicate the data they want to retrieve through its GUI, and the system generates the BODE script used in the extraction process and then starts extracting. However, users need basic knowledge of BODE script syntax, XPath, and HTML tags to build a BODE script. To lower the barrier to using the BODE system, this thesis proposes an algorithm that distinguishes the useful information blocks within a single website, with the goal of generating BODE scripts automatically.
author2	I-Chen Wu
author_facet	I-Chen Wu Wen-Feng Yao 姚文鋒
author	Wen-Feng Yao 姚文鋒
spellingShingle	Wen-Feng Yao 姚文鋒 Block-level Ranking for Intra-Website Pages
author_sort	Wen-Feng Yao
title	Block-level Ranking for Intra-Website Pages
title_short	Block-level Ranking for Intra-Website Pages
title_full	Block-level Ranking for Intra-Website Pages
title_fullStr	Block-level Ranking for Intra-Website Pages
title_full_unstemmed	Block-level Ranking for Intra-Website Pages
title_sort	block-level ranking for intra-website pages
publishDate	2007
url	http://ndltd.ncl.edu.tw/handle/2tvrgw
work_keys_str_mv	AT wenfengyao blocklevelrankingforintrawebsitepages AT yáowénfēng blocklevelrankingforintrawebsitepages AT wenfengyao wǎngzhànnèiwǎngyèzhīqūkuàiděngjífēnxī AT yáowénfēng wǎngzhànnèiwǎngyèzhīqūkuàiděngjífēnxī
_version_	1719094157033603072

Similar Items