Parallel computation of data cubes

Since its proposal, the data cube has attracted a great deal of attention in both the academic and industrial research communities. Many research papers have been published on issues related to data cubes, and many commercial OLAP (On-Line Analytical Processing) systems have been released to ma...

Bibliographic Details
Main Author: Momen-Pour, Soroush
Language: English
Published: 2009
Online Access: http://hdl.handle.net/2429/9746
id ndltd-LACETR-oai-collectionscanada.gc.ca-BVAU.2429-9746
record_format oai_dc
spelling ndltd-LACETR-oai-collectionscanada.gc.ca-BVAU.2429-9746 2014-03-14T15:43:39Z Parallel computation of data cubes Momen-Pour, Soroush 2009-06-26T23:13:08Z 2009-06-26T23:13:08Z 1999 2009-06-26T23:13:08Z 1999-11 Electronic Thesis or Dissertation http://hdl.handle.net/2429/9746 eng UBC Retrospective Theses Digitization Project [http://www.library.ubc.ca/archives/retro_theses/]
collection NDLTD
language English
sources NDLTD
description Since its proposal, the data cube has attracted a great deal of attention in both the academic and industrial research communities. Many research papers have been published on issues related to data cubes, and many commercial OLAP (On-Line Analytical Processing) systems have been released to market with data cube operations as their core functions. Several algorithms have been proposed to compute data cubes more efficiently. The most significant are the PIPESORT and PIPEHASH algorithms proposed by Agrawal et al., the OVERLAP algorithm proposed by Naughton et al., the Partitioned-Cube algorithm proposed by Ross et al., and the Multi-way Array algorithm proposed by Naughton et al. All of these algorithms are designed for sequential machines; however, computing a data cube can be an expensive task. For some organizations, it may take a very powerful computer working around the clock for a week to compute all the data cubes they want to use. Parallel processing can speed up this process. Despite the popularity and importance of data cubes, very little research has been carried out on their parallel computation. The only parallel algorithm for computing data cubes that I am aware of is the one proposed by Goil et al. Their algorithm works when the data set fits in main memory; however, real-world data sets rarely fit in main memory. The widespread availability of inexpensive cluster machines makes it possible to use parallel processing for computing data cubes, even in small firms, and as a result there could be real demand for efficient parallel data cube construction algorithms. I have designed and implemented two parallel data cube computation algorithms (the Parallel Partitioned-Cube algorithm and the Parallel Single-pass Multi-way Array algorithm) based on sequential algorithms proposed in the literature. The former is classified as a ROLAP (Relational OLAP) algorithm and the latter as a MOLAP (Multi-dimensional OLAP) algorithm.
author Momen-Pour, Soroush
title Parallel computation of data cubes
publishDate 2009
url http://hdl.handle.net/2429/9746