Hybrid-Grained Dynamic Load Balanced GEMM on NUMA Architectures

The Basic Linear Algebra Subprograms (BLAS) is a fundamental numerical software library, and GEneral Matrix Multiply (GEMM) is the most important computational kernel routine in the BLAS library. On multi-core and many-core processors, the whole workload of GEMM is partitioned and scheduled to multiple threa...

Bibliographic Details
Main Authors: Xing Su, Fei Lei
Format: Article
Language: English
Published: MDPI AG 2018-11-01
Series: Electronics
Online Access: https://www.mdpi.com/2079-9292/7/12/359