A parallelization model for performance characterization of Spark Big Data jobs on Hadoop clusters

Abstract:	This article proposes a new parallel performance model for different workloads of Spark Big Data applications running on Hadoop clusters. The proposed model can predict the runtime for generic workloads as a function of the number of executors, without necessarily knowing how the algorithms...

Bibliographic Details
Main Authors:	N. Ahmed, Andre L. C. Barczak, Mohammad A. Rashid, Teo Susnjak
Format:	Article
Language:	English
Published:	SpringerOpen 2021-08-01
Series:	Journal of Big Data
Subjects:	Big Data; Performance prediction; System configuration; HiBench; Spark
Online Access:	https://doi.org/10.1186/s40537-021-00499-7
