AKDB-Tree: An Adjustable KDB-tree for Efficiently Supporting Nearest Neighbor Queries in P2P Systems


Bibliographic Details
Main Authors: Hung-ze Liu, 劉宏澤
Other Authors: Ye-in Chang
Format: Others
Language: en_US
Published: 2008
Online Access: http://ndltd.ncl.edu.tw/handle/q87y38
Description
Summary: Master's thesis === National Sun Yat-sen University === Department of Computer Science and Engineering === 96 === In the future, more data-intensive applications, such as P2P auction networks, P2P job-search networks, and P2P multi-player games, will need to answer more complex queries, such as nearest neighbor queries involving numerous data types. For answering nearest neighbor queries (NN queries) over spatial region data in a P2P environment, a quadtree-based structure is likely a good choice. However, the quadtree stores data only in the leaf nodes, which results in load imbalance and a high cost for every query. The MX-CIF quadtree can solve this problem. It has three properties: it controls the height of the tree efficiently, it reduces load imbalance, and it reduces the scope of an NN query by controlling the value of the radius. Although the P2P MX-CIF quadtree can perform NN queries efficiently, it still has the following problems: low accuracy of the NN query, high cost of tree construction, high search cost of the NN query, and load imbalance. In fact, index structures for region data also work for point data, which can be regarded as a degenerate case of region data. Therefore, the KDB-tree, a well-known index structure for point data, can be used to reduce load imbalance, but it has the same problem as the quadtree: data is stored only in the leaf nodes. In this thesis, we propose an Adjustable KDB-tree (AKDB-tree) to improve this situation for P2P systems. The AKDB-tree has five properties: reduced load imbalance, low cost of tree construction, storage of data in both internal nodes and leaf nodes, high accuracy of the NN query, and low search cost of the NN query. The Chord system is a well-known structured P2P system in which data search is performed through a hash function, instead of the flooding used in most unstructured P2P systems.
Since the Chord system is hash-based, it handles peers joining and leaving easily. In order to combine the AKDB-tree with the Chord system, we design IDs for the nodes of the AKDB-tree, and each node is hashed to the Chord system by its ID. The IDs can be used to distinguish whether an edge node in the AKDB-tree is on a vertical edge or a horizontal edge, and to determine the relative position of two nodes in the 2D space. We can also calculate the related edge of a region in the 2D space from the ID of the region. We use these properties of the IDs to reduce the search cost of the NN query by a wide margin. In our simulation study, we compare our method with the P2P MX-CIF quadtree using five performance measures under four different settings of the P2P MX-CIF quadtree. The simulation results show that, for the NN query, the AKDB-tree provides higher accuracy and lower search cost than the P2P MX-CIF quadtree. In terms of load, the AKDB-tree is more balanced than the P2P MX-CIF quadtree. In terms of tree construction, the AKDB-tree takes less time than the P2P MX-CIF quadtree.
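
The step in which each tree node is hashed to the Chord system by its ID can be illustrated with a minimal sketch. It shows only the general Chord idea (hash a key onto an identifier ring and assign it to the first peer at or after that position), not the thesis's actual ID design, which this record does not describe. The ring size `M`, the peer names, the tree node ID strings, and the helper names `chord_id`, `successor`, and `place_nodes` are all assumptions made for illustration.

```python
import hashlib
from bisect import bisect_left

M = 16  # identifier bits; an assumption to keep the ring small for the example


def chord_id(key, m=M):
    """Hash a string key onto an m-bit identifier ring using SHA-1."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(ring, ident):
    """Return the first peer identifier at or after ident, wrapping around.

    ring must be a non-empty sorted list of peer identifiers.
    """
    i = bisect_left(ring, ident)
    return ring[i % len(ring)]


def place_nodes(tree_node_ids, peers):
    """Map each (hypothetical) tree node ID to the peer responsible for it."""
    owner = {chord_id(p): p for p in peers}  # assumes no identifier collisions
    ring = sorted(owner)
    return {nid: owner[successor(ring, chord_id(nid))] for nid in tree_node_ids}


if __name__ == "__main__":
    peers = ["peer-A", "peer-B", "peer-C", "peer-D"]  # hypothetical peers
    node_ids = ["0", "00", "01", "010", "011"]        # hypothetical node IDs
    for nid, peer in place_nodes(node_ids, peers).items():
        print(nid, "->", peer)
```

Because placement depends only on the hash of the ID and the current set of peer identifiers, a peer joining or leaving changes ownership only for the keys between it and its neighbor on the ring, which is the property the abstract relies on when it says joins and exits are easy to handle.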