Benchmarking Performance for Migrating a Relational Application to a Parallel Implementation

Bibliographic Details
Main Author:	Gadiraju, Krishna Karthik
Language:	English
Published:	University of Cincinnati / OhioLINK 2014
Subjects:	Computer Science; Hive; Hadoop; benchmarking; big data; SQL queries
Online Access:	http://rave.ohiolink.edu/etdc/view?acc_num=ucin1409065914

id	ndltd-OhioLink-oai-etd.ohiolink.edu-ucin1409065914
record_format	oai_dc
datestamp	2021-08-03T06:27:12Z
description	Many organizations rely on relational database platforms for OLAP-style querying (aggregation and filtering) for small to medium size applications. We investigate the impact of scaling up the data sizes for such queries. We intend to illustrate what kind of performance results an organization could expect should they migrate current applications to big data environments. This thesis benchmarks the performance of Hive, a parallel data warehouse platform that is a part of the Hadoop software stack. We set up a 4-node Hadoop cluster using Hortonworks HDP 1.3.2. We use the data generator provided by the TPC-DS benchmark to generate data of different scales. We use a representative query provided in the TPC-DS query set and run the SQL and Hive Query Language (HiveQL) versions of the same query on a relational database installation (MySQL) and on the Hive cluster. An analysis of the results shows that for all the dataset sizes used, Hive is faster than MySQL when executing the query. Hive loads the large datasets faster than MySQL, while it is marginally slower than MySQL when loading the smaller datasets.
date	2014-10-13
type	text
rights	unrestricted. This thesis or dissertation is protected by copyright: all rights reserved. It may not be copied or redistributed beyond the terms of applicable copyright laws.
collection	NDLTD
language	English
sources	NDLTD
topic	Computer Science; Hive; Hadoop; benchmarking; big data; SQL queries
author	Gadiraju, Krishna Karthik
title	Benchmarking Performance for Migrating a Relational Application to a Parallel Implementation
publisher	University of Cincinnati / OhioLINK
publishDate	2014
url	http://rave.ohiolink.edu/etdc/view?acc_num=ucin1409065914

Similar Items