DGX-A100 Face to Face DGX-2—Performance, Power and Thermal Behavior Evaluation

Nvidia is a leading producer of GPUs for high-performance computing and artificial intelligence, bringing top performance and energy-efficiency. We present performance, power consumption, and thermal behavior analysis of the new Nvidia DGX-A100 server equipped with eight A100 Ampere microarchitectur...

Full description

Bibliographic Details
Main Authors: Matej Špeťko, Ondřej Vysocký, Branislav Jansík, Lubomír Říha
Format: Article
Language:English
Published: MDPI AG 2021-01-01
Series:Energies
Subjects:
Online Access:https://www.mdpi.com/1996-1073/14/2/376
Description
Summary:Nvidia is a leading producer of GPUs for high-performance computing and artificial intelligence, bringing top performance and energy-efficiency. We present performance, power consumption, and thermal behavior analysis of the new Nvidia DGX-A100 server equipped with eight A100 Ampere microarchitecture GPUs. The results are compared against the previous generation of the server, Nvidia DGX-2, based on Tesla V100 GPUs. We developed a synthetic benchmark to measure the raw performance of floating-point computing units including Tensor Cores. Furthermore, thermal stability was investigated. In addition, Dynamic Frequency and Voltage Scaling (DVFS) analysis was performed to determine the best energy-efficient configuration of the GPUs executing workloads of various arithmetical intensities. Under the energy-optimal configuration the A100 GPU reaches efficiency of 51 GFLOPS/W for double-precision workload and 91 GFLOPS/W for tensor core double precision workload, which makes the A100 the most energy-efficient server accelerator for scientific simulations in the market.
ISSN:1996-1073