Benchmarking Performance for Migrating a Relational Application to a Parallel Implementation
Main Author: | Gadiraju, Krishna Karthik |
---|---|
Language: | English |
Published: | University of Cincinnati / OhioLINK, 2014 |
Subjects: | Computer Science; Hive; Hadoop; benchmarking; big data; SQL queries |
Online Access: | http://rave.ohiolink.edu/etdc/view?acc_num=ucin1409065914 |
id |
ndltd-OhioLink-oai-etd.ohiolink.edu-ucin1409065914 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-OhioLink-oai-etd.ohiolink.edu-ucin1409065914
2021-08-03T06:27:12Z
Benchmarking Performance for Migrating a Relational Application to a Parallel Implementation
Gadiraju, Krishna Karthik
Computer Science; Hive; Hadoop; benchmarking; big data; SQL queries
Many organizations rely on relational database platforms for OLAP-style querying (aggregation and filtering) in small- to medium-sized applications. We investigate the impact of scaling up the data sizes for such queries. We aim to illustrate the performance an organization could expect if it migrated its current applications to big data environments. This thesis benchmarks the performance of Hive, a parallel data warehouse platform that is part of the Hadoop software stack. We set up a 4-node Hadoop cluster using Hortonworks HDP 1.3.2. We use the data generator provided by the TPC-DS benchmark to generate data at different scales. We take a representative query from the TPC-DS query set and run its SQL and Hive Query Language (HiveQL) versions on a relational database installation (MySQL) and on the Hive cluster, respectively. An analysis of the results shows that, for all the dataset sizes used, Hive executes the query faster than MySQL. Hive loads the large datasets faster than MySQL, while it is marginally slower than MySQL when loading the smaller datasets.
2014-10-13
English
text
University of Cincinnati / OhioLINK
http://rave.ohiolink.edu/etdc/view?acc_num=ucin1409065914
unrestricted
This thesis or dissertation is protected by copyright: all rights reserved. It may not be copied or redistributed beyond the terms of applicable copyright laws.
|
collection |
NDLTD |
language |
English |
sources |
NDLTD |
topic |
Computer Science Hive Hadoop benchmarking big data SQL queries |
author |
Gadiraju, Krishna Karthik |
title |
Benchmarking Performance for Migrating a Relational Application to a Parallel Implementation |
publisher |
University of Cincinnati / OhioLINK |
publishDate |
2014 |
url |
http://rave.ohiolink.edu/etdc/view?acc_num=ucin1409065914 |