The Summarization of hierarchical data with exceptions
In many OLAP and data warehouse applications, users need to query data of interest, such as a set of data that satisfies specific properties. The usual answer to such a query simply enumerates all the interesting cells. This is the most accurate but not the most informative method. Summarizations need...
Main Author: | Bu, Shaofeng |
---|---|
Format: | Others |
Language: | English |
Published: | 2009 |
Online Access: | http://hdl.handle.net/2429/15511 |
id | ndltd-UBC-oai-circle.library.ubc.ca-2429-15511 |
---|---|
record_format | oai_dc |
spelling | ndltd-UBC-oai-circle.library.ubc.ca-2429-15511; 2018-01-05T17:37:50Z; The Summarization of hierarchical data with exceptions; Bu, Shaofeng; In many OLAP and data warehouse applications, users need to query data of interest, such as a set of data that satisfies specific properties. The usual answer to such a query simply enumerates all the interesting cells. This is the most accurate but not the most informative method. Summarizations need to be done in order to return more concise descriptions of these interesting cells to the users. The MDL approach has been applied to hierarchical data to obtain concise descriptions. However, in many cases the descriptions are not concise enough for the users. Another method, GMDL, can generate much shorter descriptions, but the GMDL descriptions are not truly pure. The motivation of our research is to overcome the disadvantages of these methods. In this thesis, we propose a methodology that focuses on summarizing hierarchical data with exceptions. We extend the MDL approach to include exceptions in the description. The exceptions are uninteresting cells. The results show that a description with exceptions is pure, meaning that it covers only "interesting cells". We call this new approach MDLE, i.e. MDL with exceptions. It aims to find the shortest description with exceptions that covers all "interesting cells". First, we study two simple cases that can be solved in polynomial time and give the algorithms. Second, we prove that MDL with exceptions is NP-hard in the general case and propose three heuristics. Finally, we present experiments that compare MDLE with MDL and GMDL. The experimental results show that MDLE generates more concise descriptions than MDL, and that MDLE produces shorter descriptions than GMDL when the white-ratio is low or there are some red cells.; Science, Faculty of; Computer Science, Department of; Graduate; 2009-11-21T20:58:49Z; 2009-11-21T20:58:49Z; 2004; 2004-11; Text; Thesis/Dissertation; http://hdl.handle.net/2429/15511; eng; For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.; 3304718 bytes; application/pdf |
collection | NDLTD |
language | English |
format | Others |
sources | NDLTD |
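
The abstract contrasts a pure description without exceptions (MDL) with a description that may list uninteresting cells as exceptions (MDLE). The sketch below illustrates that difference on a single tree-shaped hierarchy. It is not the thesis's algorithm: the dictionary representation, the function names, the length measure (selected nodes plus exception cells), and the place names in the example are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical representation: node -> list of children; a leaf (cell) has no children.
Tree = Dict[str, List[str]]
Description = Tuple[List[str], List[str]]  # (selected nodes, exception cells)


def leaves_under(tree: Tree, node: str) -> List[str]:
    """Return every leaf cell in the subtree rooted at `node`."""
    if not tree[node]:
        return [node]
    cells: List[str] = []
    for child in tree[node]:
        cells.extend(leaves_under(tree, child))
    return cells


def shortest_description(tree: Tree, node: str, interesting: Set[str],
                         allow_exceptions: bool) -> Description:
    """Shortest (nodes, exceptions) pair that covers exactly the interesting
    leaves under `node`. Length is assumed to be len(nodes) + len(exceptions)."""
    cells = leaves_under(tree, node)
    uninteresting = [c for c in cells if c not in interesting]
    if len(uninteresting) == len(cells):
        return [], []  # nothing interesting below this node

    # Option 1: describe each child subtree on its own.
    split: Optional[Description] = None
    if tree[node]:
        nodes: List[str] = []
        exceptions: List[str] = []
        for child in tree[node]:
            n, e = shortest_description(tree, child, interesting, allow_exceptions)
            nodes += n
            exceptions += e
        split = (nodes, exceptions)

    # Option 2: select this node and list its uninteresting cells as exceptions.
    # Without exceptions this is only allowed when the subtree is already pure.
    whole: Optional[Description] = None
    if allow_exceptions or not uninteresting:
        whole = ([node], uninteresting)

    candidates = [d for d in (whole, split) if d is not None]
    return min(candidates, key=lambda d: len(d[0]) + len(d[1]))


# Invented example hierarchy: one uninteresting cell (Portland) among six.
tree: Tree = {
    "All": ["West", "East"],
    "West": ["Vancouver", "Victoria", "Seattle", "Portland"],
    "East": ["Toronto", "Montreal"],
    "Vancouver": [], "Victoria": [], "Seattle": [], "Portland": [],
    "Toronto": [], "Montreal": [],
}
interesting = {"Vancouver", "Victoria", "Seattle", "Toronto", "Montreal"}

without_exceptions = shortest_description(tree, "All", interesting, False)
with_exceptions = shortest_description(tree, "All", interesting, True)
print(without_exceptions)  # four nodes, no exceptions
print(with_exceptions)     # one node plus one exception
```

In this toy case the description without exceptions needs four entries (three cities and "East"), while allowing one exception shortens it to "All except Portland". Both descriptions cover only interesting cells, which is the purity property the abstract refers to.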