Hybrid-Grained Dynamic Load Balanced GEMM on NUMA Architectures
The Basic Linear Algebra Subprograms (BLAS) is a fundamental numerical software library, and GEneral Matrix Multiply (GEMM) is the most important computational kernel routine in BLAS. On multi-core and many-core processors, the whole workload of GEMM is partitioned and scheduled to multiple threa...
Main Authors: | Xing Su, Fei Lei |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2018-11-01 |
Series: | Electronics |
Subjects: | |
Online Access: | https://www.mdpi.com/2079-9292/7/12/359 |
Similar Items
- Una piccola collezione di gemme: le gemme con i «segni celesti» di Lorenzo Lotto
  by: Francesco De Carolis
  Published: (2013-10-01)
- Accelerating R with high performance linear algebra libraries
  by: Bogdan Oancea, et al.
  Published: (2015-09-01)
- Accelerating Dense Linear Algebra for GPUs, Multicores and Hybrid Architectures: an Autotuned and Algorithmic Approach
  by: Nath, Rajib Kumar
  Published: (2010)
- An Approximate GEMM Unit for Energy-Efficient Object Detection
  by: Ratko Pilipović, et al.
  Published: (2021-06-01)
- NUMA-Aware DGEMM Based on 64-Bit ARMv8 Multicore Processors Architecture
  by: Wei Zhang, et al.
  Published: (2021-08-01)