The Study of Index on Semi-structured Data

Master's === Da-Yeh University === Master's Program, Department of Computer Science and Information Engineering === 91 === As the Internet grows in importance and is increasingly treated as a data repository, the traditional relational data model is insufficient to describe and integrate the heterogeneous data on the web, such as web pages, e-mail, newsgroup documents, and so on. Thi...

Bibliographic Details
Main Authors: Chi-Lan Lin, 林錡嵐
Other Authors: Andy Chiou
Format: Others
Language: zh-TW
Published: 2003
Online Access: http://ndltd.ncl.edu.tw/handle/80296596376088375660
id ndltd-TW-091DYU00392009
record_format oai_dc
spelling ndltd-TW-091DYU00392009 2015-10-13T17:01:16Z http://ndltd.ncl.edu.tw/handle/80296596376088375660 The Study of Index on Semi-structured Data 半結構性資料索引之研究 Chi-Lan Lin 林錡嵐 Master's, Da-Yeh University, Master's Program, Department of Computer Science and Information Engineering, 91 Andy Chiou 邱紹豐 2003 thesis 60 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === Da-Yeh University === Master's Program, Department of Computer Science and Information Engineering === 91 === As the Internet grows in importance and is increasingly treated as a data repository, the traditional relational data model is insufficient to describe and integrate the heterogeneous data on the web, such as web pages, e-mail, newsgroup documents, and so on. Such data is called semi-structured data because it has no fixed schema, and incomplete schemas are allowed. As the volume of data increases dramatically, fast data retrieval poses a new challenge to database researchers. Semi-structured data is commonly modeled with the Object Exchange Model (OEM). OEM is a graph data structure in which data attributes are represented by the edges of paths and the data values are stored at the end nodes of the paths. Because the paths in a graph may differ from one another, traditional indexing systems designed for fixed-schema data in relational databases are not suitable for this new type of data. To accelerate semi-structured data retrieval, this research proposes a new concept called the Path Merge Graph (PMG). PMG builds indices on semi-structured data using a graph data structure. To reduce the space required to store the indices, PMG allows more than one path to be embedded in a single path. By reducing the index size while still storing enough indexing information, the overhead of searching the index is reduced as well. This research also provides a special data structure for storing the new indices, along with functions such as insertion, deletion, and updating to maintain the index system.
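The OEM description above can be made concrete with a small sketch. The example below is illustrative only: it is not the thesis's Path Merge Graph, whose internal structure the abstract does not specify. It assumes an acyclic OEM-style graph with labeled edges and values at leaf nodes, and builds a plain label-path index. All names (`edges`, `values`, `build_path_index`) and the sample data are hypothetical.

```python
from collections import defaultdict

# Hypothetical OEM-style graph: each edge is (parent_id, label, child_id).
# Attribute names sit on the edges; atomic values live only at leaf objects.
edges = [
    ("root", "person", "p1"),
    ("root", "person", "p2"),
    ("p1", "name", "n1"),
    ("p1", "email", "e1"),
    ("p2", "name", "n2"),  # p2 has no email: incomplete schema is allowed
]
values = {"n1": "Alice", "e1": "alice@example.org", "n2": "Bob"}


def build_path_index(edges, values, root="root"):
    """Map each root-to-leaf label path to the leaf values it reaches.

    Assumes the graph is acyclic; a cyclic graph would need a visited set.
    """
    children = defaultdict(list)
    for parent, label, child in edges:
        children[parent].append((label, child))

    index = defaultdict(list)
    stack = [(root, ())]  # (node id, label path from the root)
    while stack:
        node, path = stack.pop()
        if node in values:
            index[path].append(values[node])
        for label, child in children[node]:
            stack.append((child, path + (label,)))
    return dict(index)


index = build_path_index(edges, values)
print(sorted(index[("person", "name")]))
print(index[("person", "email")])
```

A query such as "all person names" then becomes a single dictionary lookup on the label path instead of a graph traversal. An index like this stores one entry per distinct label path, which is the kind of space cost the abstract says PMG aims to reduce.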
author2 Andy Chiou
author_facet Andy Chiou
Chi-Lan Lin
林錡嵐
author Chi-Lan Lin
林錡嵐
title The Study of Index on Semi-structured Data
publishDate 2003
url http://ndltd.ncl.edu.tw/handle/80296596376088375660