The FAST-HEP toolset: Using YAML to make tables out of trees

The Faster Analysis Software Taskforce (FAST) is a small, European group of HEP researchers that have been investigating and developing modern software approaches to improve HEP analyses. We present here an overview of the key product of this effort: a set of packages that allows a complete implemen...


Bibliographic Details
Main Authors: Krikler Benjamin Edward, Davignon Olivier, Kreczko Lukasz, Linacre Jacob
Format: Article
Language: English
Published: EDP Sciences 2020-01-01
Series: EPJ Web of Conferences
Online Access: https://www.epj-conferences.org/articles/epjconf/pdf/2020/21/epjconf_chep2020_06016.pdf
Source: DOAJ
ISSN: 2100-014X
Volume: 245
Article Number: 06016
DOI: 10.1051/epjconf/202024506016
Author Affiliations: Krikler Benjamin Edward (University of Bristol); Davignon Olivier (Laboratoire Leprince-Ringuet, CNRS/IN2P3); Kreczko Lukasz (University of Bristol); Linacre Jacob (Rutherford Appleton Laboratory)
Description: The Faster Analysis Software Taskforce (FAST) is a small, European group of HEP researchers that have been investigating and developing modern software approaches to improve HEP analyses. We present here an overview of the key product of this effort: a set of packages that allows a complete implementation of an analysis using almost exclusively YAML files. Serving as an analysis description language (ADL), this toolset builds on top of the evolving technologies from the Scikit-HEP and IRIS-HEP projects as well as industry-standard libraries such as Pandas and Matplotlib. Data processing starts with event-level data (the trees) and can proceed by adding variables, selecting events, performing complex user-defined operations and binning data, as defined in the YAML description. The resulting outputs (the tables) are stored as Pandas dataframes which can be programmatically manipulated and converted to plots or inputs for fitting frameworks. No longer just a proof-of-principle, these tools are now being used in CMS analyses, the LUX-ZEPLIN experiment, and by students on several other experiments. In this talk we will showcase these tools through examples, highlighting how they address the different experiments’ needs, and compare them to other similar approaches.
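
The tree-to-table flow outlined in the description (add a variable, select events, bin the result into a dataframe) can be illustrated with a short sketch written directly in pandas. This is not the FAST-HEP toolset or its YAML schema, which this record does not show; the column names, cut values, bin edges and the make_table helper are all assumptions invented for the example.

```python
import pandas as pd

# Hypothetical event-level data (the "tree"): one row per event.
# Column names and values are invented for illustration only.
events = pd.DataFrame({
    "muon_pt": [12.0, 35.5, 48.2, 27.1, 61.0, 8.4],
    "muon_eta": [0.3, -1.2, 2.6, 0.9, -0.4, 1.1],
    "weight": [1.0, 0.9, 1.1, 1.0, 0.8, 1.2],
})


def make_table(events):
    """Turn event-level rows into a binned summary table (hypothetical helper)."""
    # Step 1: add a derived variable.
    df = events.assign(abs_eta=events["muon_eta"].abs())

    # Step 2: select events (assumed cuts, not taken from any real analysis).
    df = df[(df["muon_pt"] > 20.0) & (df["abs_eta"] < 2.4)]

    # Step 3: bin in muon_pt and record, per bin, the raw event count
    # and the sum of event weights.
    pt_bin = pd.cut(df["muon_pt"], bins=[20.0, 40.0, 60.0, 80.0])
    table = df.groupby(pt_bin, observed=False)["weight"].agg(n="count", sumw="sum")
    return table


table = make_table(events)
print(table)
```

The returned object is an ordinary pandas dataframe indexed by bin interval, so it can be manipulated further or passed to plotting code. In a declarative setup of the kind the description outlines, the three steps inside make_table would presumably be specified in a configuration file rather than written by hand.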