Auto-tunable GPU BLAS
In this paper, we present our implementation of an auto-tuning system, written in C++, which incorporates the use of OpenCL kernels. We deploy this approach on different GPU architectures and evaluate its performance. Our main focus is to easily generate tuned code, that would otherwis...
Main Author: | |
---|---|
Format: | Others |
Language: | English |
Published: | Norges teknisk-naturvitenskapelige universitet, Institutt for datateknikk og informasjonsvitenskap, 2012 |
Subjects: | |
Online Access: | http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-18411 |