Provision and use of GPU resources for distributed workloads via the Grid

The Queen Mary University of London WLCG Tier-2 Grid site has been providing GPU resources on the Grid since 2016. GPUs are an important modern tool to assist in data analysis. They have historically been used to accelerate computationally expensive but parallelisable workloads using frameworks such...

Bibliographic Details
Main Authors: Traynor Daniel, Froy Terry
Format: Article
Language: English
Published: EDP Sciences 2020-01-01
Series: EPJ Web of Conferences
Online Access: https://www.epj-conferences.org/articles/epjconf/pdf/2020/21/epjconf_chep2020_03002.pdf
id doaj-b104a6dc9a93473992ce52c7052c4064
record_format Article
spelling doaj-b104a6dc9a93473992ce52c7052c4064 2021-08-02T22:57:35Z eng EDP Sciences EPJ Web of Conferences 2100-014X 2020-01-01 245 03002 10.1051/epjconf/202024503002 epjconf_chep2020_03002 Provision and use of GPU resources for distributed workloads via the Grid Traynor Daniel 0 Froy Terry 1 Queen Mary University of London Queen Mary University of London The Queen Mary University of London WLCG Tier-2 Grid site has been providing GPU resources on the Grid since 2016. GPUs are an important modern tool to assist in data analysis. They have historically been used to accelerate computationally expensive but parallelisable workloads using frameworks such as OpenCL and CUDA. However, more recently their power in accelerating machine learning, using libraries such as TensorFlow and Caffe, has come to the fore and the demand for GPU resources has increased. Significant effort is being spent in high energy physics to investigate and use machine learning to enhance the analysis of data. GPUs may also provide part of the solution to the compute challenge of the High Luminosity LHC. The motivation for providing GPU resources via the Grid is presented. The installation and configuration of the SLURM batch system together with Compute Elements (CREAM and ARC) for use with GPUs are shown. Real-world use cases are presented and the successes and issues discovered are discussed. https://www.epj-conferences.org/articles/epjconf/pdf/2020/21/epjconf_chep2020_03002.pdf
collection DOAJ
language English
format Article
sources DOAJ
author Traynor Daniel
Froy Terry
spellingShingle Traynor Daniel
Froy Terry
Provision and use of GPU resources for distributed workloads via the Grid
EPJ Web of Conferences
author_facet Traynor Daniel
Froy Terry
author_sort Traynor Daniel
title Provision and use of GPU resources for distributed workloads via the Grid
title_short Provision and use of GPU resources for distributed workloads via the Grid
title_full Provision and use of GPU resources for distributed workloads via the Grid
title_fullStr Provision and use of GPU resources for distributed workloads via the Grid
title_full_unstemmed Provision and use of GPU resources for distributed workloads via the Grid
title_sort provision and use of gpu resources for distributed workloads via the grid
publisher EDP Sciences
series EPJ Web of Conferences
issn 2100-014X
publishDate 2020-01-01
description The Queen Mary University of London WLCG Tier-2 Grid site has been providing GPU resources on the Grid since 2016. GPUs are an important modern tool to assist in data analysis. They have historically been used to accelerate computationally expensive but parallelisable workloads using frameworks such as OpenCL and CUDA. However, more recently their power in accelerating machine learning, using libraries such as TensorFlow and Caffe, has come to the fore and the demand for GPU resources has increased. Significant effort is being spent in high energy physics to investigate and use machine learning to enhance the analysis of data. GPUs may also provide part of the solution to the compute challenge of the High Luminosity LHC. The motivation for providing GPU resources via the Grid is presented. The installation and configuration of the SLURM batch system together with Compute Elements (CREAM and ARC) for use with GPUs are shown. Real-world use cases are presented and the successes and issues discovered are discussed.
url https://www.epj-conferences.org/articles/epjconf/pdf/2020/21/epjconf_chep2020_03002.pdf
work_keys_str_mv AT traynordaniel provisionanduseofgpuresourcesfordistributedworkloadsviathegrid
AT froyterry provisionanduseofgpuresourcesfordistributedworkloadsviathegrid
_version_ 1721226004294270976