Database partitioning strategies for social network data

Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012. Cataloged from PDF version of thesis. Includes bibliographical references (p. 64-66). In this thesis, I designed, prototyped and benchmarked two different data partitionin...

Bibliographic Details
Main Author: Moll Thomae, Oscar Ricardo
Other Authors: Stu Hood and Samuel R. Madden.
Format: Others
Language: English
Published: Massachusetts Institute of Technology 2013
Subjects: Electrical Engineering and Computer Science.
Online Access: http://hdl.handle.net/1721.1/77449
id ndltd-MIT-oai-dspace.mit.edu-1721.1-77449
record_format oai_dc
spelling ndltd-MIT-oai-dspace.mit.edu-1721.1-77449 2019-05-02T16:28:51Z Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. 2013-03-01T15:06:18Z 2012 Thesis http://hdl.handle.net/1721.1/77449 826515301 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582 66 p. application/pdf
collection NDLTD
topic Electrical Engineering and Computer Science.
description Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012. Cataloged from PDF version of thesis. Includes bibliographical references (p. 64-66). In this thesis, I designed, prototyped and benchmarked two different data partitioning strategies for social network type workloads. The first strategy takes advantage of the heavy-tailed degree distributions of social networks to optimize the latency of vertex neighborhood queries. The second strategy takes advantage of the high temporal locality of workloads to improve latencies for vertex neighborhood intersection queries. Both techniques aim to shorten the tail of the latency distribution, while avoiding decreased write performance or reduced system throughput when compared to the default hash partitioning approach. The strategies presented were evaluated using synthetic workloads of my own design as well as real workloads provided by Twitter, and show promising improvements in latency at some cost in system complexity. By Oscar Ricardo Moll Thomae. M.Eng.