Summary: | Power, performance, and cost dictate the procurement and operation of high-performance computing (HPC) systems. These systems use graphics processing units (GPUs) to boost performance. To identify systems that are inexpensive to acquire and to operate, it is important to systematically compare candidate systems with respect to their power, performance, and energy characteristics when running the end-use applications. Additionally, the chosen systems must often achieve performance objectives without exceeding their respective power budgets, a task that is usually handled by a software-based power management system. Accurately predicting the power consumption of an application at different dynamic voltage and frequency scaling (DVFS) levels (or, more generally, different processor configurations) is paramount for the efficient functioning of such a management system.
This thesis aims to apply state-of-the-art green computing research to optimize the total cost of acquisition and ownership of heterogeneous computing systems. To achieve this, we take a two-fold approach. First, we explore the issue of greener device selection by characterizing device power and performance. For this, we explore previously untapped opportunities arising from a special type of graphics processor, the low-power integrated GPU, which is commonly available in commodity systems. We compare the greenness (power, energy, and energy-delay product, or EDP) of the integrated GPU against that of a CPU running at different frequencies for the specific application domain of scientific visualization. Second, we explore the problem of predicting the power consumption of a GPU at different DVFS states via machine-learning techniques. Specifically, we perform statistically rigorous experiments to uncover the strengths and weaknesses of eight machine-learning techniques (namely, ZeroR, simple linear regression, KNN, bagging, random forest, SMO regression, decision tree, and neural networks) in predicting GPU power consumption at different frequencies. Our study shows that a support vector machine-aided regression model (i.e., SMO regression) achieves the highest accuracy, with a mean absolute error (MAE) of 4.5%. We also observe that the random forest method produces the most consistent results, with a reasonable overall MAE of 7.4%. Our results further show that different models perform best in distinct regions of the application space. We therefore develop a novel ensemble technique that draws on the best characteristics of the various algorithms; it reduces the MAE to 3.5% and the maximum error to 11%, compared with a maximum error of 20% for SMO regression. === MS
|