M*Ctree: A Multi-Resolution Indexing Structure for XML Data

XML has emerged as a universal data exchange format for disseminating and sharing information, particularly on the World Wide Web. As more XML data are generated, stored, and exchanged, the need to index XML data efficiently for querying purposes is becoming increasingly important. Designing efficie...


Bibliographic Details
Main Author:	Guruvadoo, Eranna K.
Published:	NSUWorks 2007
Subjects:	Computer Sciences
Online Access:	http://nsuworks.nova.edu/gscis_etd/559

id	ndltd-nova.edu-oai-nsuworks.nova.edu-gscis_etd-1558
record_format	oai_dc
spelling	ndltd-nova.edu-oai-nsuworks.nova.edu-gscis_etd-1558 2016-04-25T19:40:51Z M*Ctree: A Multi-Resolution Indexing Structure for XML Data Guruvadoo, Eranna K. XML has emerged as a universal data exchange format for disseminating and sharing information, particularly on the World Wide Web. As more XML data are generated, stored, and exchanged, the need to index XML data efficiently for querying purposes is becoming increasingly important. Designing efficient indexing structures for XML data presents serious challenges because any indexing scheme must index the structural as well as the data components of XML documents and provide tight integration of the two components. This thesis studies XML indexing methods for tree-structured XML documents that can be queried by a subset of XPath expressions. More specifically, this thesis proposes a new main-memory index structure, named M*Ctree, which is an enhanced Ctree. Unlike the Ctree, which is constructed solely from the structural and data characteristics of the database, the M*Ctree includes query workload characteristics in order to speed up query evaluation on frequently used paths and nodes of the index structure. The Ctree index uses arrays to preserve child-parent relationships among individual data node pairs in a summary tree structure in order to avoid expensive structural join costs. The M*Ctree combines the use of child-parent links with additional arrays that provide child-ancestor links along frequently used paths to accelerate query evaluation. These child-ancestor links can be precomputed from query workloads and added or removed as needed to reflect changing workload characteristics. Combined with value indexes that are sensitive to both structure and content, the M*Ctree becomes a multi-resolution index structure optimized for frequently used paths and achieves better overall performance than index structures that do not consider query workloads. The M*Ctree trades extra memory for the additional arrays in exchange for better execution times for queries along frequently used paths of the index structure. Experiments conducted in this research show that the M*Ctree achieves better performance than the Ctree for both simple and branching path queries matching index paths with child-ancestor links. The M*Ctree achieves a larger performance gain over the Ctree on index paths that do not contain regular groups. 2007-01-01T08:00:00Z text http://nsuworks.nova.edu/gscis_etd/559 CEC Theses and Dissertations NSUWorks Computer Sciences
collection	NDLTD
sources	NDLTD
topic	Computer Sciences
spellingShingle	Computer Sciences Guruvadoo, Eranna K. M*Ctree: A Multi-Resolution Indexing Structure for XML Data
description	XML has emerged as a universal data exchange format for disseminating and sharing information, particularly on the World Wide Web. As more XML data are generated, stored, and exchanged, the need to index XML data efficiently for querying purposes is becoming increasingly important. Designing efficient indexing structures for XML data presents serious challenges because any indexing scheme must index the structural as well as the data components of XML documents and provide tight integration of the two components. This thesis studies XML indexing methods for tree-structured XML documents that can be queried by a subset of XPath expressions. More specifically, this thesis proposes a new main-memory index structure, named M*Ctree, which is an enhanced Ctree. Unlike the Ctree, which is constructed solely from the structural and data characteristics of the database, the M*Ctree includes query workload characteristics in order to speed up query evaluation on frequently used paths and nodes of the index structure. The Ctree index uses arrays to preserve child-parent relationships among individual data node pairs in a summary tree structure in order to avoid expensive structural join costs. The M*Ctree combines the use of child-parent links with additional arrays that provide child-ancestor links along frequently used paths to accelerate query evaluation. These child-ancestor links can be precomputed from query workloads and added or removed as needed to reflect changing workload characteristics. Combined with value indexes that are sensitive to both structure and content, the M*Ctree becomes a multi-resolution index structure optimized for frequently used paths and achieves better overall performance than index structures that do not consider query workloads. The M*Ctree trades extra memory for the additional arrays in exchange for better execution times for queries along frequently used paths of the index structure. Experiments conducted in this research show that the M*Ctree achieves better performance than the Ctree for both simple and branching path queries matching index paths with child-ancestor links. The M*Ctree achieves a larger performance gain over the Ctree on index paths that do not contain regular groups.
author	Guruvadoo, Eranna K.
author_facet	Guruvadoo, Eranna K.
author_sort	Guruvadoo, Eranna K.
title	M*Ctree: A Multi-Resolution Indexing Structure for XML Data
title_short	M*Ctree: A Multi-Resolution Indexing Structure for XML Data
title_full	M*Ctree: A Multi-Resolution Indexing Structure for XML Data
title_fullStr	M*Ctree: A Multi-Resolution Indexing Structure for XML Data
title_full_unstemmed	M*Ctree: A Multi-Resolution Indexing Structure for XML Data
title_sort	m*ctree: a multi-resolution indexing structure for xml data
publisher	NSUWorks
publishDate	2007
url	http://nsuworks.nova.edu/gscis_etd/559
work_keys_str_mv	AT guruvadooerannak mctreeamultiresolutionindexingstructureforxmldata
_version_	1718248548718346240
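
The description above contrasts child-parent arrays with workload-driven child-ancestor arrays. The following is a minimal sketch of that idea only, not the thesis's actual data layout: the class `GroupNode`, the functions `ancestor_of`, `add_shortcut`, and `remove_shortcut`, and the `library/book/chapter/para` example are all hypothetical names chosen for illustration. It assumes each summary-tree node stores, for every element in its group, the position of that element's parent within the parent group.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class GroupNode:
    """Hypothetical summary-tree node: all data nodes sharing one label path."""
    name: str
    parent: Optional["GroupNode"] = None
    # parent_idx[i] = position, inside the parent group, of element i's parent.
    parent_idx: List[int] = field(default_factory=list)
    # Optional shortcut arrays, keyed by ancestor group name (assumed layout).
    ancestor_idx: Dict[str, List[int]] = field(default_factory=dict)


def ancestor_of(group: GroupNode, i: int, target: str) -> Tuple[int, int]:
    """Return (position of element i's ancestor in `target`, array lookups used)."""
    hops = 0
    while group.name != target:
        shortcut = group.ancestor_idx.get(target)
        if shortcut is not None:
            # A child-ancestor array answers the question in one lookup.
            return shortcut[i], hops + 1
        if group.parent is None:
            raise ValueError(f"{target!r} is not an ancestor group")
        # Otherwise follow one child-parent link and continue upward.
        i = group.parent_idx[i]
        group = group.parent
        hops += 1
    return i, hops


def add_shortcut(group: GroupNode, target: str) -> None:
    """Precompute a child-ancestor array by composing child-parent arrays."""
    positions = list(range(len(group.parent_idx)))
    g = group
    while g.name != target:
        if g.parent is None:
            raise ValueError(f"{target!r} is not an ancestor group")
        positions = [g.parent_idx[p] for p in positions]
        g = g.parent
    group.ancestor_idx[target] = positions


def remove_shortcut(group: GroupNode, target: str) -> None:
    """Drop a shortcut when the workload no longer uses that path."""
    group.ancestor_idx.pop(target, None)


# Illustrative path library/book/chapter/para with made-up element counts.
library = GroupNode("library")
book = GroupNode("book", library, [0, 0])
chapter = GroupNode("chapter", book, [0, 1, 1])
para = GroupNode("para", chapter, [0, 1, 1, 2])

slow = ancestor_of(para, 3, "book")   # walks para -> chapter -> book
add_shortcut(para, "book")            # pretend the workload favors //book//para
fast = ancestor_of(para, 3, "book")   # single lookup through the shortcut
remove_shortcut(para, "book")
print(slow, fast)
```

In this sketch the shortcut costs one extra integer per element of the child group, which mirrors the memory-for-time trade-off the description mentions; how the thesis actually selects paths or encodes regular groups is not modeled here.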

M*Ctree: A Multi-Resolution Indexing Structure for XML Data

Similar Items