A HTML Rendering-Based Page Segmentation Algorithm ( HRPS )
Master's === National Chiao Tung University === College of Computer Science, In-Service Master's Program, Computer Science Division === 98 === According to statistics, 113 million websites existed as of 2010, 99.9% of which were established within roughly the past 15 years. Given such a large and rapidly changing body of page data, using it effectively is an important problem. For the inform...
Main Authors: | Yu, Ti-fan 余提梵 |
---|---|
Other Authors: | Wu, I-Chen 吳毅成 |
Format: | Others |
Language: | zh-TW |
Published: | 2010 |
Online Access: | http://ndltd.ncl.edu.tw/handle/77593930402034234237 |
id |
ndltd-TW-098NCTU5392024 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-098NCTU5392024 2016-04-18T04:21:48Z http://ndltd.ncl.edu.tw/handle/77593930402034234237 A HTML Rendering-Based Page Segmentation Algorithm ( HRPS ) 基於HTML文件佈局之網頁分割演算法 Yu, Ti-fan 余提梵 Master's National Chiao Tung University College of Computer Science, In-Service Master's Program, Computer Science Division 98 Wu, I-Chen 吳毅成 2010 thesis 40 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === National Chiao Tung University === College of Computer Science, In-Service Master's Program, Computer Science Division === 98 === According to statistics, 113 million websites existed as of 2010, 99.9% of which were established within roughly the past 15 years. Given such a large and rapidly changing body of page data, using it effectively is an important problem. For information whose location we do not know, we usually rely on a search engine to find it; for information whose location we do know, we use data extraction to work more efficiently. Whether the tool is a search engine or an information extraction tool, the first step in analyzing a complex web page is to segment the page into regions by subject.
Since a Microsoft team released the Vision-based Page Segmentation algorithm (VIPS) in 2003, most papers on page segmentation have been based on visual segmentation. In recent years, however, more and more web page layouts have been built with DHTML techniques, and applying the original VIPS method to such pages exposes small defects that its design did not take into account. Later studies proposed many page segmentation algorithms that combine VIPS with other patterns to make up for these deficiencies.
But because these algorithms rely on other features to compensate for VIPS, that part of the process loses the character of visual segmentation. This thesis presents a method that remains based on visual segmentation while incorporating HTML document rendering features, in order to solve the problem that visual separators may not be found when visually segmenting DHTML pages.
|
author2 |
Wu, I-Chen |
author_facet |
Wu, I-Chen Yu, Ti-fan 余提梵 |
author |
Yu, Ti-fan 余提梵 |
title |
A HTML Rendering-Based Page Segmentation Algorithm ( HRPS ) |
publishDate |
2010 |
url |
http://ndltd.ncl.edu.tw/handle/77593930402034234237 |