Schema Matching for Unsupervised Wrapper Maintenance

Master's === National Central University === Institute of Computer Science and Information Engineering === 99 === A wrapper is a program that extracts specific data from web pages. Researchers can access that data through a wrapper and use information integration to turn it into useful information, and then provide a set of integrated network services, sys...

Bibliographic Details
Main Author: Chi-I Kuan (官直毅)
Other Authors: Chia-Hui Chang
Format: Others
Language: zh-TW
Published: 2011
Online Access: http://ndltd.ncl.edu.tw/handle/86832735537308239594
id ndltd-TW-099NCU05392116
record_format oai_dc
spelling ndltd-TW-099NCU05392116 2017-07-14T04:27:44Z http://ndltd.ncl.edu.tw/handle/86832735537308239594 Schema Matching for Unsupervised Wrapper Maintenance 非監督式包覆程式維護之綱要對映 Chi-I Kuan 官直毅 Master's, National Central University, Institute of Computer Science and Information Engineering, 99 Chia-Hui Chang 張嘉惠 2011 thesis 47 zh-TW
collection NDLTD
sources NDLTD
description Master's === National Central University === Institute of Computer Science and Information Engineering === 99 === A wrapper is a program that extracts specific data from web pages. Researchers can access that data through a wrapper and use information integration to turn it into useful information, and then provide a set of integrated network services, systems, or data analysis systems. However, site developers often modify a website to meet changing needs, which breaks the original wrapper so that it can no longer extract data. In this situation, the program developer has no option but to rewrite or modify the original wrapper. For this reason, unsupervised wrapper induction has been widely discussed in recent years. It automatically builds an extraction module from the regularity of dynamic web pages and extracts data with that module, so programmers do not need to write a wrapper for each specific website. The problem that unsupervised wrapper induction may encounter is maintenance. If the website changes over time, we will have two sets of extracted data, one at time t and one at time t'. Our goal is to identify the related information in the two sets and integrate them. We use the instance and structure information generated by FiVaTech (the unsupervised wrapper induction tool we used) to match the correlated attributes.