Best Trade-Off Point Method for Efficient Resource Provisioning in Spark

Considering the recent exponential growth in the amount of information processed in Big Data, the high energy consumed by data processing engines in datacenters has become a major issue, underlining the need for efficient resource allocation for more energy-efficient computing. We previously propose...

Bibliographic Details
Main Author:	Peter P. Nghiem
Format:	Article
Language:	English
Published:	MDPI AG 2018-11-01
Series:	Algorithms
Subjects:	Apache Spark; Hadoop MapReduce; YARN; algorithm for best trade-off point; optimization; resource provisioning; performance efficiency; energy efficiency; elbow curve
Online Access:	https://www.mdpi.com/1999-4893/11/12/190

id	doaj-c8afa9777c6c4180ade85387243d5f4d
record_format	Article
spelling	doaj-c8afa9777c6c4180ade85387243d5f4d | 2020-11-24T21:35:10Z | eng | MDPI AG | Algorithms | 1999-4893 | 2018-11-01 | 11 | 12 | 190 | 10.3390/a11120190 | a11120190 | Best Trade-Off Point Method for Efficient Resource Provisioning in Spark | Peter P. Nghiem (Department of Computer Engineering, School of Engineering, Santa Clara University, 500 El Camino Real, Santa Clara, CA 95053, USA) | Considering the recent exponential growth in the amount of information processed in Big Data, the high energy consumed by data processing engines in datacenters has become a major issue, underlining the need for efficient resource allocation for more energy-efficient computing. We previously proposed the Best Trade-off Point (BToP) method, which provides a general approach and techniques based on an algorithm with mathematical formulas to find the best trade-off point on an elbow curve of performance vs. resources for efficient resource provisioning in Hadoop MapReduce. The BToP method is expected to work for any application or system which relies on a trade-off elbow curve, non-inverted or inverted, for making good decisions. In this paper, we apply the BToP method to the emerging cluster computing framework, Apache Spark, and show that its performance and energy consumption are better than Spark with its built-in dynamic resource allocation enabled. Our Spark-Bench tests confirm the effectiveness of using the BToP method with Spark to determine the optimal number of executors for any workload in production environments where job profiling for behavioral replication will lead to the most efficient resource provisioning. | https://www.mdpi.com/1999-4893/11/12/190 | Apache Spark; Hadoop MapReduce; YARN; algorithm for best trade-off point; optimization; resource provisioning; performance efficiency; energy efficiency; elbow curve
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Peter P. Nghiem
title	Best Trade-Off Point Method for Efficient Resource Provisioning in Spark
publisher	MDPI AG
series	Algorithms
issn	1999-4893
publishDate	2018-11-01
description	Considering the recent exponential growth in the amount of information processed in Big Data, the high energy consumed by data processing engines in datacenters has become a major issue, underlining the need for efficient resource allocation for more energy-efficient computing. We previously proposed the Best Trade-off Point (BToP) method, which provides a general approach and techniques based on an algorithm with mathematical formulas to find the best trade-off point on an elbow curve of performance vs. resources for efficient resource provisioning in Hadoop MapReduce. The BToP method is expected to work for any application or system which relies on a trade-off elbow curve, non-inverted or inverted, for making good decisions. In this paper, we apply the BToP method to the emerging cluster computing framework, Apache Spark, and show that its performance and energy consumption are better than Spark with its built-in dynamic resource allocation enabled. Our Spark-Bench tests confirm the effectiveness of using the BToP method with Spark to determine the optimal number of executors for any workload in production environments where job profiling for behavioral replication will lead to the most efficient resource provisioning.
topic	Apache Spark; Hadoop MapReduce; YARN; algorithm for best trade-off point; optimization; resource provisioning; performance efficiency; energy efficiency; elbow curve
url	https://www.mdpi.com/1999-4893/11/12/190
