Increasing SpMV Energy Efficiency Through Compression: A study of how format, input and platform properties affect the energy efficiency of Compressed Sparse eXtended

Bibliographic Details
Main Author: Simonsen, Lars-Ivar H
Format: Others
Language: English
Published: Norges teknisk-naturvitenskapelige universitet, Institutt for datateknikk og informasjonsvitenskap 2013
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-22977
Description
Summary: This work is a continuation and augmentation of previous energy studies of Compressed Sparse eXtended (CSX), a framework for efficiently executing Sparse Matrix-Vector Multiplication (SpMV). CSX was developed by the CSLab at the National Technical University of Athens (NTUA), and utilizes compression to overcome a significant memory bottleneck inherent in SpMV, thus increasing performance and energy efficiency of its execution.

SpMV is notorious within scientific computing for its low performance. However, the problem is unavoidable, as SpMV can be found within several scientific applications. In this work, CSX is tested as the SpMV kernel in a framework implementing the Conjugate Gradient Method (CG), an iterative algorithm for solving specific linear algebra problems. CSX is also evaluated against Compressed Sparse Row (CSR), a storage scheme widely used when executing SpMV.

This work augments existing studies by evaluating properties in the formats themselves, in the matrices used as input and in the target platform to gain knowledge on how to maximize the benefits of CSX, as well as for what cases CSX does not prove beneficial. The work also compares the performance of SpMV execution on a stand-alone server known as the CARD-server to similar execution on the Vilje supercomputer. This is done to evaluate how the differences between these two machines affect the results.

Based on the results, it is shown that CSX should be used for matrices larger than the Last Level Cache (LLC) of the target machine and for matrices with high degrees of clustering in their values. The best energy efficiency trade-offs are found at eight threads on dual socket configurations, and this is shown to be related to the number of physical cores per CPU. Similarly, frequency throttling is shown to increase the energy efficiency of the execution only at high numbers of threads and at the cost of performance.

Overall, CSX is shown to obtain higher energy efficiency than CSR for SpMV execution, given a suitable problem and run configuration. Thus, it is confirmed that CSX can be used to decrease the energy consumption of SpMV applications.
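
For readers unfamiliar with the CSR baseline named in the summary, the following is a minimal Python sketch of a CSR SpMV kernel. It is an illustration only, not code from the thesis or from the CSX framework. The names `csr_spmv` and `delta_encode` are hypothetical. The delta-encoding helper is a generic example of how clustered column indices become small, compressible numbers; it is an assumption for illustration and does not claim to reproduce the encoding CSX actually uses.

```python
def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A * x for a matrix A stored in CSR form.

    values  : nonzero entries, stored row by row
    col_idx : column index of each nonzero (same length as values)
    row_ptr : row i owns nonzeros row_ptr[i] .. row_ptr[i + 1] - 1
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # Each multiply-add reads one value and one index from memory,
            # which is why the kernel tends to be limited by memory traffic.
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def delta_encode(cols):
    """Generic delta encoding of one row's sorted column indices
    (illustrative assumption, not the CSX encoding)."""
    if not cols:
        return []
    return [cols[0]] + [b - a for a, b in zip(cols, cols[1:])]


# Example matrix:
# [[4, 0, 1],
#  [0, 3, 0],
#  [2, 0, 5]]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0]

y = csr_spmv(values, col_idx, row_ptr, x)
print(y)  # [7.0, 6.0, 17.0]

# Clustered indices turn into small deltas that fit in fewer bits.
print(delta_encode([100, 101, 102, 103]))  # [100, 1, 1, 1]
```

The sketch shows why index data matters: every nonzero carries a column index alongside its value, so shrinking the index stream reduces the bytes moved per multiply-add.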