Parallel computation of data cubes
Since its proposal, the data cube has attracted a great deal of attention in both the academic and industrial research communities. Many research papers have been published on different issues related to data cubes, and many commercial OLAP (On-Line Analytical Processing) systems have been released to ma...
Main Author: | Momen-Pour, Soroush |
---|---|
Language: | English |
Published: | 2009 |
Online Access: | http://hdl.handle.net/2429/9746 |
id |
ndltd-LACETR-oai-collectionscanada.gc.ca-BVAU.2429-9746 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-LACETR-oai-collectionscanada.gc.ca-BVAU.2429-9746 2014-03-14T15:43:39Z Parallel computation of data cubes Momen-Pour, Soroush 2009-06-26T23:13:08Z 2009-06-26T23:13:08Z 1999 2009-06-26T23:13:08Z 1999-11 Electronic Thesis or Dissertation http://hdl.handle.net/2429/9746 eng UBC Retrospective Theses Digitization Project [http://www.library.ubc.ca/archives/retro_theses/] |
collection |
NDLTD |
language |
English |
sources |
NDLTD |
description |
Since its proposal, the data cube has attracted a great deal of attention in both
the academic and industrial research communities. Many research papers have been
published on different issues related to data cubes, and many commercial OLAP
(On-Line Analytical Processing) systems have been released to market with data
cube operations as their core functions.
Several algorithms have been proposed to compute data cubes more efficiently.
The most significant are the PIPESORT and PIPEHASH algorithms proposed by
Agrawal et al., the OVERLAP algorithm proposed by Naughton et al., the
Partitioned-Cube algorithm proposed by Ross et al., and the Multi-way Array
algorithm proposed by Naughton et al.
All of these algorithms are designed for sequential machines; however,
computing a data cube can be an expensive task. For some organizations,
it may take a very powerful computer working around the clock for a week to
compute all the data cubes they may want to use. Parallel processing
can speed up this process. Despite the popularity and importance of data cubes,
very little research has been carried out on their parallel computation. The
only parallel algorithm for computing data cubes that I am aware of is the
one proposed by Goil et al. Their algorithm works when the data
set fits in main memory; however, real-world data sets rarely fit in main memory.
The widespread availability of inexpensive cluster machines makes it possible
to use parallel processing for computing data cubes, even in small firms, and
as a result there could be real demand for efficient parallel data cube construction
algorithms. I have designed and implemented two parallel data cube computation
algorithms (the Parallel Partitioned-Cube algorithm and the Parallel Single-pass
Multi-way Array algorithm) based on sequential algorithms proposed in the literature.
The former is classified as a ROLAP (Relational OLAP) algorithm and the latter
as a MOLAP (Multi-dimensional OLAP) algorithm. |
author |
Momen-Pour, Soroush |
spellingShingle |
Momen-Pour, Soroush Parallel computation of data cubes |
author_facet |
Momen-Pour, Soroush |
author_sort |
Momen-Pour, Soroush |
title |
Parallel computation of data cubes |
title_short |
Parallel computation of data cubes |
title_full |
Parallel computation of data cubes |
title_fullStr |
Parallel computation of data cubes |
title_full_unstemmed |
Parallel computation of data cubes |
title_sort |
parallel computation of data cubes |
publishDate |
2009 |
url |
http://hdl.handle.net/2429/9746 |
work_keys_str_mv |
AT momenpoursoroush parallelcomputationofdatacubes |
_version_ |
1716651802952204288 |
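
As a rough illustration of what the description above means by computing a data cube, and why the task is expensive and a candidate for parallel processing, the sketch below computes every group-by over a set of dimensions (2^d group-bys for d dimensions) and then shows that cubes built on disjoint partitions of the rows can be merged. This is a naive baseline written for this record, not the Partitioned-Cube or Multi-way Array algorithm from the thesis; the names `compute_cube` and `merge_cubes`, the sample `sales` rows, and the choice of SUM as the aggregate are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def compute_cube(rows, dims, measure):
    """Naive data cube: SUM(measure) for every subset of dims (2^d group-bys)."""
    cube = {}
    for k in range(len(dims) + 1):
        for group in combinations(dims, k):
            totals = defaultdict(int)
            for row in rows:
                # The group-by key is the row's values on the chosen dimensions.
                key = tuple(row[d] for d in group)
                totals[key] += row[measure]
            cube[group] = dict(totals)
    return cube


def merge_cubes(partials):
    """Combine cubes computed independently on disjoint partitions of the rows.

    This works because SUM is distributive: partial sums can simply be added.
    """
    merged = {}
    for cube in partials:
        for group, totals in cube.items():
            target = merged.setdefault(group, {})
            for key, value in totals.items():
                target[key] = target.get(key, 0) + value
    return merged


# Hypothetical fact table with three dimensions and one measure.
sales = [
    {"store": "A", "item": "pen", "year": 1998, "amount": 3},
    {"store": "A", "item": "ink", "year": 1999, "amount": 5},
    {"store": "B", "item": "pen", "year": 1999, "amount": 2},
    {"store": "B", "item": "pen", "year": 1998, "amount": 4},
]
dims = ("store", "item", "year")

cube = compute_cube(sales, dims, "amount")

# Each half could be processed by a different worker; here they run in sequence.
halves = [
    compute_cube(sales[:2], dims, "amount"),
    compute_cube(sales[2:], dims, "amount"),
]
merged = merge_cubes(halves)
```

With three dimensions the cube holds eight group-bys, from the grand total `cube[()]` to the full `("store", "item", "year")` breakdown, and `merged` equals `cube`. The naive version rescans the rows once per group-by, which is the kind of cost the sequential algorithms named in the description are designed to reduce.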