A Study on Top-k Dominance in Metric Space using R-trees

Master's === National Taipei University of Technology === Graduate Institute of Computer Science and Information Engineering === 104 === Top-k dominating queries are an important tool for ‘similarity search’ in database and decision support applications. A top-k dominating query returns the k data items with the highest dominance in a dataset. It combines the advantages of two powerful preference...


Bibliographic Details
Main Author: Muzwandile Z. W. Makhubu
Other Authors: 劉傳銘
Format: Others
Published: 2016
Online Access: http://ndltd.ncl.edu.tw/handle/zhnjv6
id ndltd-TW-104TIT05392020
record_format oai_dc
spelling ndltd-TW-104TIT05392020 2019-05-15T22:54:23Z http://ndltd.ncl.edu.tw/handle/zhnjv6 A Study on Top-k Dominance in Metric Space using R-trees 使用R-tree進行度量空間中頂端k物件處裡之探討 Muzwandile Z. W. Makhubu Master's National Taipei University of Technology Graduate Institute of Computer Science and Information Engineering 104 劉傳銘 2016 thesis 0
collection NDLTD
format Others
sources NDLTD
description Master's === National Taipei University of Technology === Graduate Institute of Computer Science and Information Engineering === 104 === Top-k dominating queries are an important tool for ‘similarity search’ in database and decision support applications. A top-k dominating query returns the k data items with the highest dominance in a dataset. It combines the advantages of two powerful preference query techniques, the top-k query and the skyline query, while mitigating their individual disadvantages. A great deal of work has been done on solving the top-k dominating query problem on multivariate datasets, where data items are defined as multidimensional points. Most of that work handles the case where the dataset is static and, to a lesser extent, distance-based dynamic data. In this work the top-k dominating query is performed over metric space data, which poses different challenges from those encountered in the abovementioned multivariate data. In this scenario we have data objects and their distances to a set of input query objects, and these distances can change dynamically as input query objects are generated. Two algorithms are developed to solve the problem of top-k dominating queries in metric space. Typically, metric space index structures such as the M-tree would be used in such a situation. In this work we show how to use R-trees efficiently in a metric space instead. Moreover, the work also investigates means of reducing the memory footprint of the processing algorithm. This is an important direction, as it makes the top-k dominating query solution applicable to wireless broadcast environments where the processing nodes may have limited resources (e.g. wireless sensor networks). We were able to show that the R-tree can be used effectively to index data that would typically be indexed using metric space indexes. Moreover, the algorithms described are capable of finding the top-k dominating results without first finding the exact dominance score of an object.
The two algorithms proposed are called the Direct-Top-k Dominating (D-TKD) Query algorithm and the Enhanced-Top-k Dominating (E-TKD) Query algorithm. The D-TKD algorithm solves the problem without the use of any sophisticated indexing scheme, while the E-TKD algorithm employs the R-tree as an index. We show the performance of these two algorithms for different dataset sizes, query set sizes, and result sizes. We demonstrate the performance improvement that results from using the R-tree index in the E-TKD method.
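As a rough illustration of the query described in the abstract, the following is a minimal brute-force sketch in Python. It is not the thesis's D-TKD or E-TKD algorithm. It assumes a common definition of metric-space dominance that the abstract does not spell out: an object dominates another if it is at least as close to every query object and strictly closer to at least one. The names `dominates` and `top_k_dominating`, the Euclidean distance default, and the sample data are all hypothetical.

```python
import heapq
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a_dists: Sequence[float], b_dists: Sequence[float]) -> bool:
    """Assumed dominance test on distance vectors: a is no farther than b
    from every query object and strictly closer to at least one."""
    no_worse = all(x <= y for x, y in zip(a_dists, b_dists))
    strictly_better = any(x < y for x, y in zip(a_dists, b_dists))
    return no_worse and strictly_better


def top_k_dominating(
    objects: Sequence[Point],
    queries: Sequence[Point],
    k: int,
    dist: Callable[[Point, Point], float] = math.dist,
) -> List[Tuple[Point, int]]:
    """Return the k objects that dominate the most other objects,
    paired with their dominance scores (brute force, no index)."""
    # Map every object to its vector of distances to the query objects.
    dist_vectors = [[dist(o, q) for q in queries] for o in objects]

    scored = []
    for i, dv in enumerate(dist_vectors):
        # Dominance score: how many other objects this one dominates.
        score = sum(
            1
            for j, other in enumerate(dist_vectors)
            if j != i and dominates(dv, other)
        )
        scored.append((score, i))

    # Highest score first; ties are broken by original position.
    best = heapq.nlargest(k, scored, key=lambda s: (s[0], -s[1]))
    return [(objects[i], score) for score, i in best]


if __name__ == "__main__":
    data = [(0, 0), (1, 1), (2, 2), (5, 5)]
    query_objects = [(0, 0), (1, 0)]
    print(top_k_dominating(data, query_objects, k=2))
```

This sketch computes every exact dominance score with a quadratic number of comparisons and uses no index, which is the kind of cost that the indexed approach described in the abstract aims to avoid.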
author2 劉傳銘
author_facet 劉傳銘
Muzwandile Z. W. Makhubu
author Muzwandile Z. W. Makhubu
spellingShingle Muzwandile Z. W. Makhubu
A Study on Top-k Dominance in Metric Space using R-trees
author_sort Muzwandile Z. W. Makhubu
title A Study on Top-k Dominance in Metric Space using R-trees
title_sort study on top-k dominance in metric space using r-trees
publishDate 2016
url http://ndltd.ncl.edu.tw/handle/zhnjv6