Summary: | Big data analytics is now widely applied to counter growing cybercrime threats. However, energy consumption is increasing explosively with the fast growth of big data processing for anti-cybercrime applications. In this paper, an energy-efficient framework for big data applications is proposed to reduce energy consumption while satisfying deadline constraints. First, the problem of energy-efficient task scheduling for a single Spark job is modeled as an integer program. We then design an energy-efficient task scheduling algorithm to minimize the energy consumption of big data applications in Spark. To avoid service-level agreement violations on execution time, we propose an optimal task scheduling algorithm with deadline constraints that trades off execution time against energy consumption. Experiments on a Spark cluster measure the energy consumption and execution time of several workloads from the HiBench benchmark suite. On average, our algorithms consume less energy than the FIFO and FAIR schedulers under deadline constraints. The optimal algorithm finds near-optimal task schedules that trade off energy consumption against response-time benefit for small shuffle partitions.
|