Design and Analysis of Nearest Neighbor Search Strategies

碩士 === 國立中山大學 === 資訊工程學系研究所 === 90 === With the proliferation of wireless communications and rapid advances in technologies, algorithms for efficiently answering queries about large number of spatial data are needed. Spatial data consists of spatial objects including data of higher dimension. Neighb...

Full description

Bibliographic Details
Main Authors: Hue-Ling Chen, 陳慧玲
Other Authors: Ye-In Chang
Format: Others
Language:en_US
Published: 2002
Online Access:http://ndltd.ncl.edu.tw/handle/87412624225347213682
id ndltd-TW-090NSYS5392014
record_format oai_dc
spelling ndltd-TW-090NSYS53920142015-10-13T10:28:29Z http://ndltd.ncl.edu.tw/handle/87412624225347213682 Design and Analysis of Nearest Neighbor Search Strategies 尋找最近鄰居方法之設計與分析 Hue-Ling Chen 陳慧玲 碩士 國立中山大學 資訊工程學系研究所 90 With the proliferation of wireless communications and rapid advances in technologies, algorithms for efficiently answering queries about large number of spatial data are needed. Spatial data consists of spatial objects including data of higher dimension. Neighbor finding is one of the most important spatial operations in the field of spatial data structures. In recent years, many researchers have focused on finding efficient solutions to the nearest neighbor problem (NN) which involves determining the point in a data set that is the nearest to a given query point. It is frequently used in Geographical Information Systems (GIS). A block B is said to be the neighbor of another block A, if block B has the same property as block A has and covers an equal-sized neighbor of block A. Jozef Voros has proposed a neighbor finding strategy on images represented by quadtrees, in which the four equal-sized neighbors (the east, west, north, and south directions) of block A can be found. However, based on Voros''s strategy, the case that the nearest neighbor occurs in the diagonal directions (the northeast, northwest, southeast, and southwest directions) will be ignored. Moreover, there is no total ordering that preserve proximity when mapping a spatial data from a higher dimensional space to a 1D-space. One way of effecting such a mapping is to utilize space-filling curves. Space-filling curves pass through every point in a space and give a one-one correspondence between the coordinate and the 1D-sequence number of the point. The Peano curve, proposed by Orenstein, which maps the 1D-coordinate of a point by simply interleaving the bits of the X and Y coordinates in the 2D-space, can be easily used in neighbor finding. But with the data ordered by the RBG curve or the Hilbert curve, the neighbor finding would be complex. The RBG curve achieves savings in random accesses on the disk for range queries and the Hilbert curve achieves the best clustering for range queries. Therefore, in this thesis, we first show the missing case in the Voros''s strategy and show the ways to find it. Next, we show that the Peano curve is the best mapping function used in the nearest neighbor finding. We also show the transformation rules between the Peano curve and the other curves such that we can efficiently find the nearest neighbor, when the data is linearly ordered by the other curves. From our simulation, we show that our proposed two strategies can work correctly and faster than the conventional strategies in nearest neighbor finding. Finally, we present a revised version of NA-Trees, which can work for exact match queries and range queries from a large, dynamic index, where an exact match query means finding the specific data object in a spatial database and a range query means reporting all data objects which are located in a specific range. By large, we mean that most of the index must be stored in secondary memory. By dynamic, we mean that insertions and deletions are intermixed with queries, so that the index cannot be built beforehand. Ye-In Chang 張玉盈 2002 學位論文 ; thesis 105 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description 碩士 === 國立中山大學 === 資訊工程學系研究所 === 90 === With the proliferation of wireless communications and rapid advances in technologies, algorithms for efficiently answering queries about large number of spatial data are needed. Spatial data consists of spatial objects including data of higher dimension. Neighbor finding is one of the most important spatial operations in the field of spatial data structures. In recent years, many researchers have focused on finding efficient solutions to the nearest neighbor problem (NN) which involves determining the point in a data set that is the nearest to a given query point. It is frequently used in Geographical Information Systems (GIS). A block B is said to be the neighbor of another block A, if block B has the same property as block A has and covers an equal-sized neighbor of block A. Jozef Voros has proposed a neighbor finding strategy on images represented by quadtrees, in which the four equal-sized neighbors (the east, west, north, and south directions) of block A can be found. However, based on Voros''s strategy, the case that the nearest neighbor occurs in the diagonal directions (the northeast, northwest, southeast, and southwest directions) will be ignored. Moreover, there is no total ordering that preserve proximity when mapping a spatial data from a higher dimensional space to a 1D-space. One way of effecting such a mapping is to utilize space-filling curves. Space-filling curves pass through every point in a space and give a one-one correspondence between the coordinate and the 1D-sequence number of the point. The Peano curve, proposed by Orenstein, which maps the 1D-coordinate of a point by simply interleaving the bits of the X and Y coordinates in the 2D-space, can be easily used in neighbor finding. But with the data ordered by the RBG curve or the Hilbert curve, the neighbor finding would be complex. The RBG curve achieves savings in random accesses on the disk for range queries and the Hilbert curve achieves the best clustering for range queries. Therefore, in this thesis, we first show the missing case in the Voros''s strategy and show the ways to find it. Next, we show that the Peano curve is the best mapping function used in the nearest neighbor finding. We also show the transformation rules between the Peano curve and the other curves such that we can efficiently find the nearest neighbor, when the data is linearly ordered by the other curves. From our simulation, we show that our proposed two strategies can work correctly and faster than the conventional strategies in nearest neighbor finding. Finally, we present a revised version of NA-Trees, which can work for exact match queries and range queries from a large, dynamic index, where an exact match query means finding the specific data object in a spatial database and a range query means reporting all data objects which are located in a specific range. By large, we mean that most of the index must be stored in secondary memory. By dynamic, we mean that insertions and deletions are intermixed with queries, so that the index cannot be built beforehand.
author2 Ye-In Chang
author_facet Ye-In Chang
Hue-Ling Chen
陳慧玲
author Hue-Ling Chen
陳慧玲
spellingShingle Hue-Ling Chen
陳慧玲
Design and Analysis of Nearest Neighbor Search Strategies
author_sort Hue-Ling Chen
title Design and Analysis of Nearest Neighbor Search Strategies
title_short Design and Analysis of Nearest Neighbor Search Strategies
title_full Design and Analysis of Nearest Neighbor Search Strategies
title_fullStr Design and Analysis of Nearest Neighbor Search Strategies
title_full_unstemmed Design and Analysis of Nearest Neighbor Search Strategies
title_sort design and analysis of nearest neighbor search strategies
publishDate 2002
url http://ndltd.ncl.edu.tw/handle/87412624225347213682
work_keys_str_mv AT huelingchen designandanalysisofnearestneighborsearchstrategies
AT chénhuìlíng designandanalysisofnearestneighborsearchstrategies
AT huelingchen xúnzhǎozuìjìnlínjūfāngfǎzhīshèjìyǔfēnxī
AT chénhuìlíng xúnzhǎozuìjìnlínjūfāngfǎzhīshèjìyǔfēnxī
_version_ 1716828525634256896