Spatial indexing in the Era of Social Media

The rapid adoption of smart phones and the social media boom have increased interest in location-based services. A new set of applications and popular online services that utilize users' locations has been created, and many ordinary people are increasingly interacting with these s...

Bibliographic Details
Main Author: Alsubaiee, Sattam
Language: EN
Published: University of California, Irvine 2014
Subjects: Computer Science
Online Access: http://pqdtopen.proquest.com/#viewpdf?dispub=3642912
id ndltd-PROQUEST-oai-pqdtoai.proquest.com-3642912
record_format oai_dc
spelling ndltd-PROQUEST-oai-pqdtoai.proquest.com-3642912 2014-10-09T04:08:01Z Spatial indexing in the Era of Social Media Alsubaiee, Sattam Computer Science University of California, Irvine 2014-10-08 00:00:00.0 thesis http://pqdtopen.proquest.com/#viewpdf?dispub=3642912 EN
collection NDLTD
language EN
sources NDLTD
topic Computer Science
description The rapid adoption of smart phones and the social media boom have increased interest in location-based services. A new set of applications and popular online services that utilize users' locations has been created, and many ordinary people increasingly interact with these services on a daily basis through their smart phones, tablets, cameras, etc., most of which come equipped with GPS sensors. The new complex features provided by those applications and the scale of the data they handle impose new and interesting challenges for spatial databases. In this thesis, we present spatial indexing and query processing techniques in response to some of these challenges.

First, we study how to support approximate keyword search on spatial data. Many popular websites support keyword search on their spatial data, such as business listings and photos. In these systems, users may have difficulty finding the entities they are looking for, such as a restaurant, if they do not know the exact spelling of the name. We develop three algorithms for constructing a specialized index that can answer location-based approximate keyword queries, successively improving time and space efficiency by exploiting the textual and spatial properties of the data. We experimentally demonstrate the efficiency of our techniques on real, large datasets.

Second, we introduce a framework for converting an in-place update, disk-based data structure to a deferred-update, append-only data structure. We show that converting an R-tree index (or any other non-totally-ordered index) to an LSM index is non-trivial if the resultant index is expected to have performant read and write operations. Our framework enables the "LSM-ification" of any kind of index structure that supports certain primitive operations, enabling the index to ingest data efficiently. We have implemented our framework in the context of the AsterixDB system as a way to extend both the R-tree and the inverted keyword index to LSM-based indexes. Our results show that an LSM-based version of the R-tree can significantly outperform its conventional counterpart in both ingestion and query speed.

Third, we study how to optimize the performance of query workloads that favor recent data. There are many use cases where users of a database system are mostly interested in querying recent data. We propose a solution that exploits the natural partitioning property that LSM-based indexes provide for their components, allowing us to filter out many components when answering queries. Our solution is generalizable to any LSM-based index structure, including LSM R-trees, and has been implemented in the context of the AsterixDB system. Our experiments show that we can reduce query times by up to 99% for selective range predicates.
author Alsubaiee, Sattam
title Spatial indexing in the Era of Social Media
title_sort spatial indexing in the era of social media
publisher University of California, Irvine
publishDate 2014
url http://pqdtopen.proquest.com/#viewpdf?dispub=3642912