ICQ-Tree: An Inverted Code Quadtree for the Top-K Spatial Keyword Query over the Incremental Database

Bibliographic Details
Main Authors: I-Hsiang Su Wang, 蘇王奕翔
Other Authors: Ye-In Chang
Format: Others
Language: en_US
Published: 2018
Online Access: http://ndltd.ncl.edu.tw/handle/9m98gv
Description
Summary: Master's thesis === National Sun Yat-sen University === Department of Computer Science and Engineering === 106 === With the rapid growth of geo-spatial data, spatial databases have become increasingly widespread. For example, entering the keyword ‘restaurant’ in Google Maps displays a number of nearby restaurants, and Instagram users upload photos with location information and use hashtags to classify them. Applications of geographic information have brought a great deal of convenience and interest to our lives. As the number of spatial objects and keywords increases, storage cost and search efficiency become more important. A spatial keyword query consists of spatial information and textual information. Spatial information can be expressed as a point, a line, or a shape on a map; a keyword is a description or a user’s comment. A query that retrieves the top-k objects in the space that match the keyword conditions is called a top-k spatial keyword query. Zhang et al. propose a data structure called the IL-Quadtree, which combines the inverted index and the linear quadtree to reduce the storage size effectively. To further improve the efficiency of top-k spatial keyword query algorithms, they propose a signature filtering technique: during query processing, the signature is checked to decide whether a node is a candidate node, which reduces the number of node traversals. However, their approach must build a large number of quadtrees to store the object data. For n keywords, it creates n trees. When performing a signature check, each IL-Quadtree must be cross-checked, which takes a long time. Therefore, in this thesis, we propose the ICQ-tree algorithm. Our ICQ-tree combines the quadtree with the inverted code, which is our improvement on the inverted index. The contributions of our approach are as follows.
First, we construct only one ICQ-tree to store the data, instead of building n quadtrees for the n keywords in the database. Second, each node of the ICQ-tree records non-repeated locations, and each object is recorded only once. Finally, we enhance each node in the ICQ-tree with the inverted code, so that a node and all of its child nodes can be pruned immediately once we know that one of the query keywords is definitely not in the node. Our simulation results show that the proposed approach is more efficient than the IL-Quadtree algorithm.
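
The pruning idea in the summary can be illustrated with a small sketch. This is not the thesis's implementation: the abstract does not specify how the inverted code is encoded, so the sketch assumes a plain bitmask per node (one bit per keyword occurring anywhere in that node's subtree). It also assumes AND semantics for the query keywords, ranking by Euclidean distance, and a point quadtree with a leaf capacity of 2. All names (`Node`, `encode`, `insert`, `top_k`, `CAPACITY`) are hypothetical.

```python
import heapq
import math
from dataclasses import dataclass, field

CAPACITY = 2  # assumed leaf capacity before a node splits


@dataclass
class Node:
    x0: float
    y0: float
    x1: float
    y1: float
    code: int = 0  # assumed "inverted code": bit i set if keyword i is in the subtree
    objects: list = field(default_factory=list)   # (x, y, keyword_mask, name)
    children: list = field(default_factory=list)  # empty for a leaf, else 4 quadrants


def encode(words, vocab):
    """Map keywords to a bitmask, assigning a new bit to each unseen keyword."""
    mask = 0
    for w in words:
        mask |= 1 << vocab.setdefault(w, len(vocab))
    return mask


def child_for(node, x, y):
    mx, my = (node.x0 + node.x1) / 2, (node.y0 + node.y1) / 2
    return node.children[(1 if x >= mx else 0) + (2 if y >= my else 0)]


def insert(node, obj, depth=0, max_depth=16):
    node.code |= obj[2]  # every node on the path learns the object's keywords
    if node.children:
        insert(child_for(node, obj[0], obj[1]), obj, depth + 1, max_depth)
        return
    node.objects.append(obj)  # each object is stored exactly once, in one leaf
    if len(node.objects) > CAPACITY and depth < max_depth:
        mx, my = (node.x0 + node.x1) / 2, (node.y0 + node.y1) / 2
        node.children = [Node(node.x0, node.y0, mx, my), Node(mx, node.y0, node.x1, my),
                         Node(node.x0, my, mx, node.y1), Node(mx, my, node.x1, node.y1)]
        pending, node.objects = node.objects, []
        for o in pending:
            insert(child_for(node, o[0], o[1]), o, depth + 1, max_depth)


def min_dist(node, qx, qy):
    dx = max(node.x0 - qx, 0.0, qx - node.x1)
    dy = max(node.y0 - qy, 0.0, qy - node.y1)
    return math.hypot(dx, dy)


def top_k(root, qx, qy, query_mask, k):
    """Best-first search; a subtree is skipped if its code lacks any query keyword."""
    heap, tie, results = [(0.0, 0, root)], 1, []
    while heap and len(results) < k:
        dist, _, item = heapq.heappop(heap)
        if not isinstance(item, Node):
            results.append((item[3], dist))
            continue
        if (item.code & query_mask) != query_mask:
            continue  # prune this node and all of its children
        for o in item.objects:
            if (o[2] & query_mask) == query_mask:
                heapq.heappush(heap, (math.hypot(o[0] - qx, o[1] - qy), tie, o))
                tie += 1
        for c in item.children:
            if (c.code & query_mask) == query_mask:
                heapq.heappush(heap, (min_dist(c, qx, qy), tie, c))
                tie += 1
    return results


vocab = {}
root = Node(0.0, 0.0, 10.0, 10.0)
for name, x, y, words in [("cafe_a", 1, 1, ["coffee", "wifi"]), ("cafe_b", 2, 2, ["coffee"]),
                          ("lib", 3, 3, ["wifi", "books"]), ("cafe_c", 8, 8, ["coffee", "wifi"]),
                          ("diner", 1.5, 1.5, ["food"])]:
    insert(root, (x, y, encode(words, vocab), name))

print([n for n, _ in top_k(root, 0.0, 0.0, encode(["coffee", "wifi"], vocab), 2)])
# prints ['cafe_a', 'cafe_c']
```

In this sketch a single tree serves every keyword, and the per-node bitmask is a conservative filter: a subtree whose code is missing a query keyword cannot contain a match, so it is never expanded. A subtree that passes the check may still contain no matching object, which is why each object is tested again before it is queued.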