Summary: | Master's === National Cheng Kung University === Department of Computer Science and Information Engineering (Master's and Doctoral Program) === 98 === Graphics processing units (GPUs) have recently become an important high-performance computing platform, and using GPUs to accelerate computation has proven feasible in many application fields. However, many independent tasks do not fully utilize GPU resources on their own, which makes the scheduling of independent tasks an important issue worth studying for platforms with GPUs. This thesis proposes a two-stage task scheduling scheme for CUDA systems. In the first stage, the task allocation problem is mapped to the bin packing problem, and the First-Fit Decreasing algorithm is selected to solve it. In the second stage, tasks are executed using asynchronous transfers, so that data transfers overlap with computation. Based on the experimental results, the proposed method is on average 1.57 times faster than the sequential method and 1.2 times faster than the merge kernel method proposed by Guevara et al. [9].
|
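
The first stage described above relies on First-Fit Decreasing, which can be illustrated with a minimal sketch. The abstract does not say what the thesis uses as item size or bin capacity, so the names `demands` and `capacity` below are hypothetical: each task is assumed to have a single integer resource demand, and each bin is assumed to represent one batch of tasks that fits on the GPU together. This is a generic version of the algorithm, not the thesis's implementation.

```python
from typing import List


def first_fit_decreasing(demands: List[int], capacity: int) -> List[List[int]]:
    """Pack tasks into bins with First-Fit Decreasing.

    demands:  hypothetical per-task resource demand (assumption, not from the thesis)
    capacity: hypothetical capacity of one bin, e.g. one batch on the GPU
    Returns a list of bins, each a list of task indices.
    """
    if any(d > capacity for d in demands):
        raise ValueError("a task demand exceeds the bin capacity")

    # Visit tasks from largest to smallest demand (the "decreasing" part).
    order = sorted(range(len(demands)), key=lambda i: demands[i], reverse=True)

    bins: List[List[int]] = []  # task indices assigned to each bin
    free: List[int] = []        # remaining capacity of each bin

    for i in order:
        # Place the task in the first bin that still has room (the "first-fit" part).
        for b, room in enumerate(free):
            if demands[i] <= room:
                bins[b].append(i)
                free[b] -= demands[i]
                break
        else:
            # No existing bin fits, so open a new one.
            bins.append([i])
            free.append(capacity - demands[i])
    return bins


# Example with made-up demands: six tasks, bin capacity 10.
batches = first_fit_decreasing([4, 8, 1, 4, 2, 1], capacity=10)
print(batches)
```

Under these assumptions, each returned bin would be one group of tasks to run together, and the second stage would then execute each group with asynchronous transfers overlapped with computation.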