Dynamic cubing for hierarchical multidimensional data space

Data warehouses have been used in many applications for a long time. Traditionally, new data is loaded into these warehouses through offline bulk updates, which means that the latest data is not always available for analysis. This, however, is not acceptable in many modern applications (such as i...

Bibliographic Details
Main Author: Ahmed, Usman
Language: ENG
Published: INSA de Lyon 2013
Subjects:
Online Access: http://tel.archives-ouvertes.fr/tel-00876624
http://tel.archives-ouvertes.fr/docs/00/87/66/24/PDF/these.pdf
id ndltd-CCSD-oai-tel.archives-ouvertes.fr-tel-00876624
record_format oai_dc
spelling ndltd-CCSD-oai-tel.archives-ouvertes.fr-tel-00876624 2013-11-05T03:20:02Z 2013ISAL0011 2013-02-18 ENG PhD thesis INSA de Lyon
collection NDLTD
language ENG
sources NDLTD
topic [INFO:INFO_OH] Computer Science/Other
Information Technology
Data warehouse
Real time
OLAP
Data cube
Partial view materialization
Multidimensional data indexing
description Data warehouses have been used in many applications for a long time. Traditionally, new data is loaded into these warehouses through offline bulk updates, which means that the latest data is not always available for analysis. This is not acceptable in many modern applications (such as intelligent buildings and smart grids) that require the latest data for decision making. These applications require fast, atomic, real-time integration of incoming facts into the data warehouse. Moreover, the data defining the analysis dimensions, stored in the dimension tables of these warehouses, also needs to be updated in real time whenever it changes. In this thesis, such real-time data warehouses are defined as dynamic data warehouses. We propose a data model for these dynamic data warehouses and present the concept of the Hierarchical Hybrid Multidimensional Data Space (HHMDS), which consists of both ordered and non-ordered hierarchical dimensions. The axes of the data space are non-ordered, which allows them to evolve dynamically without any need for reordering. We define a data grouping structure, called the Minimum Bounding Space (MBS), that supports efficient partitioning of the data in the space. Various operators, relations and metrics are defined for optimizing these data partitions, and the analogies between classical OLAP concepts and the HHMDS are established. We propose efficient algorithms to store summarized or detailed data, in the form of MBSs, in a tree structure called the DyTree. Algorithms for OLAP queries over the DyTree are also detailed. The nodes of the DyTree, which hold MBSs with associated aggregated measure values, represent materialized sections of cuboids, and the tree as a whole is a partially materialized and indexed data cube that is maintained using online atomic incremental updates. We propose a methodology to experimentally evaluate partial data cubing techniques, and a prototype implementing this methodology has been developed. The prototype lets us experimentally evaluate and simulate the structure and performance of the DyTree against other solutions. An extensive study conducted using this prototype shows that the DyTree is an efficient and effective partial data cubing solution for a dynamic data warehousing environment.
author Ahmed, Usman
title Dynamic cubing for hierarchical multidimensional data space
publisher INSA de Lyon
publishDate 2013
url http://tel.archives-ouvertes.fr/tel-00876624
http://tel.archives-ouvertes.fr/docs/00/87/66/24/PDF/these.pdf
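
The description field above says that the axes of the data space are non-ordered and that an MBS groups data while carrying aggregated measure values that are maintained by incremental updates. The record does not give the thesis's formal definitions, so the following is only an illustrative sketch under stated assumptions: the class `BoundingSpace`, the helper `is_ancestor_or_self`, the path-tuple encoding of hierarchy members, and the `level` parameter are all hypothetical and are not taken from the thesis. It shows why an unordered member set per dimension can absorb a new member without any reordering, and how a sum and a count can be updated one fact at a time.

```python
from dataclasses import dataclass

# Assumption: a hierarchy member is a path from the top level down,
# e.g. ("France", "Lyon"). This encoding is hypothetical.
Path = tuple[str, ...]


def is_ancestor_or_self(member: Path, leaf: Path) -> bool:
    """True if `member` is a prefix of `leaf` in the dimension hierarchy."""
    return leaf[:len(member)] == member


@dataclass
class BoundingSpace:
    """Hypothetical stand-in for an MBS: one unordered member set per dimension."""
    members: list[set[Path]]
    total: float = 0.0   # running SUM of the measure
    count: int = 0       # running COUNT of absorbed facts

    def covers(self, point: list[Path]) -> bool:
        """A point is covered if, on every dimension, some member is its ancestor."""
        return all(
            any(is_ancestor_or_self(m, leaf) for m in dim_members)
            for dim_members, leaf in zip(self.members, point)
        )

    def insert(self, point: list[Path], measure: float, level: int = 1) -> None:
        """Absorb one fact: grow member sets where needed, then update aggregates."""
        for dim_members, leaf in zip(self.members, point):
            if not any(is_ancestor_or_self(m, leaf) for m in dim_members):
                # Adding to a set never requires reordering the axis.
                dim_members.add(leaf[:level])
        self.total += measure
        self.count += 1


space = BoundingSpace(members=[set(), set()])
space.insert([("France", "Lyon"), ("Building A", "Floor 1")], 21.5)
space.insert([("France", "Paris"), ("Building B", "Floor 3")], 19.0)
```

Because each dimension is held as a set rather than a sorted range, a previously unseen member such as `("Building B",)` is simply added. A real index would also need splitting, tree maintenance and query algorithms, which this sketch does not attempt.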