The Study of Index on Semi-structured Data
Master's === Da-Yeh University === Master's Program, Department of Computer Science and Information Engineering === 91 === As the Internet grows in importance and is increasingly treated as a data repository, the traditional relational data model is insufficient to describe and integrate the heterogeneous data on the web, such as web pages, e-mail, and newsgroup documents. Thi...
Main Authors: | Chi-Lan Lin, 林錡嵐 |
---|---|
Other Authors: | Andy Chiou |
Format: | Others |
Language: | zh-TW |
Published: | 2003 |
Online Access: | http://ndltd.ncl.edu.tw/handle/80296596376088375660 |
id |
ndltd-TW-091DYU00392009 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-091DYU00392009 2015-10-13T17:01:16Z http://ndltd.ncl.edu.tw/handle/80296596376088375660 The Study of Index on Semi-structured Data 半結構性資料索引之研究 Chi-Lan Lin 林錡嵐 Master's Da-Yeh University Master's Program, Department of Computer Science and Information Engineering 91 As the Internet grows in importance and is increasingly treated as a data repository, the traditional relational data model is insufficient to describe and integrate the heterogeneous data on the web, such as web pages, e-mail, and newsgroup documents. Such data is called semi-structured data because it has no fixed schema and may have an incomplete schema. As data volume increases dramatically, fast data retrieval poses a new challenge to database researchers. Semi-structured data is normally modeled with the Object Exchange Model (OEM). OEM is a graph data structure in which data attributes are represented by the edges of paths and the data are stored at the end nodes of the paths. Because the paths in a graph may differ from one another, traditional indexing systems designed for fixed-schema data in relational databases are not suitable for this type of data. To accelerate semi-structured data retrieval, this research proposes a new concept called the Path Merge Graph (PMG). PMG builds indices on semi-structured data using a graph data structure. To reduce the space required to store the indices, PMG allows more than one path to be embedded in a single path. Reducing the index size while still storing enough indexing information also reduces the overhead of searching the index. This research also provides a special data structure for storing the new indices, along with functions such as insertion, deletion, and updating to maintain the index system. Andy Chiou 邱紹豐 2003 thesis 60 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === Da-Yeh University === Master's Program, Department of Computer Science and Information Engineering === 91 === As the Internet grows in importance and is increasingly treated as a data repository, the traditional relational data model is insufficient to describe and integrate the heterogeneous data on the web, such as web pages, e-mail, and newsgroup documents. Such data is called semi-structured data because it has no fixed schema and may have an incomplete schema. As data volume increases dramatically, fast data retrieval poses a new challenge to database researchers. Semi-structured data is normally modeled with the Object Exchange Model (OEM). OEM is a graph data structure in which data attributes are represented by the edges of paths and the data are stored at the end nodes of the paths. Because the paths in a graph may differ from one another, traditional indexing systems designed for fixed-schema data in relational databases are not suitable for this type of data. To accelerate semi-structured data retrieval, this research proposes a new concept called the Path Merge Graph (PMG). PMG builds indices on semi-structured data using a graph data structure. To reduce the space required to store the indices, PMG allows more than one path to be embedded in a single path. Reducing the index size while still storing enough indexing information also reduces the overhead of searching the index. This research also provides a special data structure for storing the new indices, along with functions such as insertion, deletion, and updating to maintain the index system.
|
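As a rough illustration of the OEM description in the abstract (attributes on edges, data at the end nodes of paths), the following Python sketch builds a small labeled graph and a plain label-path index over it. It is not taken from the thesis: the class `OEMNode`, the function `build_path_index`, and the sample records are hypothetical, the data is assumed to be tree-shaped (no cycles), and the sketch does not implement PMG, whose details the abstract does not give.

```python
from collections import defaultdict


class OEMNode:
    """Hypothetical OEM-style node: labeled outgoing edges, or an atomic value at a leaf."""

    def __init__(self, value=None):
        self.value = value                 # atomic data, present only at end nodes
        self.edges = defaultdict(list)     # edge label (attribute name) -> child nodes

    def add(self, label, child):
        """Attach `child` under the edge `label` and return it for chaining."""
        self.edges[label].append(child)
        return child


def build_path_index(root):
    """Map each label path (tuple of edge labels) to the leaf values it reaches.

    Assumes acyclic data; a real OEM graph may contain cycles and would need
    visited-node tracking.
    """
    index = defaultdict(list)

    def walk(node, path):
        if node.value is not None:
            index[path].append(node.value)
        for label, children in node.edges.items():
            for child in children:
                walk(child, path + (label,))

    walk(root, ())
    return dict(index)


# Two "person" records with different attributes: no fixed schema is required.
root = OEMNode()
p1 = root.add("person", OEMNode())
p1.add("name", OEMNode("Ann"))
p1.add("email", OEMNode("ann@example.org"))
p2 = root.add("person", OEMNode())
p2.add("name", OEMNode("Bob"))
addr = p2.add("address", OEMNode())
addr.add("city", OEMNode("Taipei"))

index = build_path_index(root)
print(index[("person", "name")])             # ['Ann', 'Bob']
print(index[("person", "address", "city")])  # ['Taipei']
```

A lookup by label path then avoids traversing the whole graph. Note that this naive index stores every distinct path separately, which is the kind of space cost the abstract says PMG aims to reduce.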
author2 |
Andy Chiou |
author_facet |
Andy Chiou Chi-Lan Lin 林錡嵐 |
author |
Chi-Lan Lin 林錡嵐 |
title |
The Study of Index on Semi-structured Data |
publishDate |
2003 |
url |
http://ndltd.ncl.edu.tw/handle/80296596376088375660 |