Producing Madgraph5_aMC@NLO gridpacks and using TensorFlow GPU resources in the CMS HTCondor Global Pool

The CMS experiment has an HTCondor Global Pool, composed of more than 200K CPU cores available for Monte Carlo production and the analysis of data. The submission of user jobs to this pool is handled by either CRAB, the standard workflow management tool used by CMS users to submit analysis jobs requiri...

Bibliographic Details
Main Authors: Bockelman Brian Paul, Fajardo Hernandez Edgar, Davila Foyo Diego, Hurtado Anampa Kenyi, Aftab Khan Farrukh, Larson Krista, Letts James, Mascheroni Marco, Mason David, Perez-Calero Yzquierdo Antonio, Trendafilovz Ivanov Todor
Format: Article
Language: English
Published: EDP Sciences 2019-01-01
Series: EPJ Web of Conferences
Online Access: https://www.epj-conferences.org/articles/epjconf/pdf/2019/19/epjconf_chep2018_03004.pdf
id doaj-70bac3de63284d6f9a4f571398c65437
record_format Article
spelling doaj-70bac3de63284d6f9a4f571398c65437; 2021-08-02T03:51:51Z; eng; EDP Sciences; EPJ Web of Conferences; 2100-014X; 2019-01-01; 214; 03004; 10.1051/epjconf/201921403004; epjconf_chep2018_03004
collection DOAJ
language English
format Article
sources DOAJ
author Bockelman Brian Paul
Fajardo Hernandez Edgar
Davila Foyo Diego
Hurtado Anampa Kenyi
Aftab Khan Farrukh
Larson Krista
Letts James
Mascheroni Marco
Mason David
Perez-Calero Yzquierdo Antonio
Trendafilovz Ivanov Todor
publisher EDP Sciences
series EPJ Web of Conferences
issn 2100-014X
publishDate 2019-01-01
description The CMS experiment has an HTCondor Global Pool, composed of more than 200K CPU cores available for Monte Carlo production and the analysis of data. The submission of user jobs to this pool is handled by either CRAB, the standard workflow management tool used by CMS users to submit analysis jobs requiring event processing of large amounts of data, or by CMS Connect, a service focused on final-stage condor-like analysis jobs and applications that already have a workflow job manager in place. The latter scenario can bring cases in which workflows need further adjustments in order to work efficiently in a globally distributed pool of resources. For instance, the generation of matrix elements for high energy physics processes via Madgraph5_aMC@NLO and the usage of tools not (yet) fully supported by the CMS software, such as TensorFlow with GPU support, are tasks with particular requirements. A special adaptation, either at the pool factory level (advertising GPU resources) or at the execute level (e.g., to handle special parameters that describe certain needs for the remote execute nodes during submission), is needed in order to work adequately in the CMS Global Pool. This contribution describes the challenges and the efforts made towards adapting such workflows so they can properly profit from the Global Pool via CMS Connect.
url https://www.epj-conferences.org/articles/epjconf/pdf/2019/19/epjconf_chep2018_03004.pdf