Towards scalar synchronization in SIMT architectures
Graphics processing units (GPUs) are an important class of compute accelerators. Popular programming models for non-graphics computation on GPUs, such as CUDA and OpenCL, provide an abstraction of many parallel scalar threads. Contemporary GPU hardware groups 32 to 64 scalar threads into a single warp...
Main Author: | Ramamurthy, Arun |
---|---|
Language: | English |
Published: | University of British Columbia, 2011 |
Online Access: | http://hdl.handle.net/2429/37732 |
Similar Items
-
Towards scalar synchronization in SIMT architectures
by: Ramamurthy, Arun
Published: (2011) -
A simple method for rejection sampling efficiency improvement on SIMT architectures
by: Ridley, Gavin, et al.
Published: (2021) -
Bus System for Coresonic SIMT DSP
by: Svensk, Gustav
Published: (2016)