The FAST-HEP toolset: Using YAML to make tables out of trees
The Faster Analysis Software Taskforce (FAST) is a small European group of HEP researchers who have been investigating and developing modern software approaches to improve HEP analyses. We present here an overview of the key product of this effort: a set of packages that allows a complete implemen...
Main Authors: | Krikler, Benjamin Edward; Davignon, Olivier; Kreczko, Lukasz; Linacre, Jacob |
---|---|
Format: | Article |
Language: | English |
Published: | EDP Sciences, 2020-01-01 |
Series: | EPJ Web of Conferences |
Online Access: | https://www.epj-conferences.org/articles/epjconf/pdf/2020/21/epjconf_chep2020_06016.pdf |
id |
doaj-2f3a7d15e1014a1da5bf9bc49cffecf4 |
---|---|
record_format |
Article |
spelling |
doaj-2f3a7d15e1014a1da5bf9bc49cffecf4; 2021-08-02T22:58:27Z; eng; EDP Sciences; EPJ Web of Conferences; ISSN 2100-014X; 2020-01-01; vol. 245, article 06016; doi:10.1051/epjconf/202024506016; epjconf_chep2020_06016; The FAST-HEP toolset: Using YAML to make tables out of trees; Krikler, Benjamin Edward (University of Bristol); Davignon, Olivier (Laboratoire Leprince-Ringuet, CNRS/IN2P3); Kreczko, Lukasz (University of Bristol); Linacre, Jacob (Rutherford Appleton Laboratory) |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Krikler, Benjamin Edward; Davignon, Olivier; Kreczko, Lukasz; Linacre, Jacob |
title |
The FAST-HEP toolset: Using YAML to make tables out of trees |
publisher |
EDP Sciences |
series |
EPJ Web of Conferences |
issn |
2100-014X |
publishDate |
2020-01-01 |
description |
The Faster Analysis Software Taskforce (FAST) is a small European group of HEP researchers who have been investigating and developing modern software approaches to improve HEP analyses. We present here an overview of the key product of this effort: a set of packages that allows a complete implementation of an analysis using almost exclusively YAML files. Serving as an analysis description language (ADL), this toolset builds on the evolving technologies from the Scikit-HEP and IRIS-HEP projects as well as industry-standard libraries such as Pandas and Matplotlib. Data processing starts with event-level data (the trees) and can proceed by adding variables, selecting events, performing complex user-defined operations and binning data, as defined in the YAML description. The resulting outputs (the tables) are stored as Pandas dataframes, which can be programmatically manipulated and converted to plots or inputs for fitting frameworks. No longer just a proof of principle, these tools are now being used in CMS analyses, the LUX-ZEPLIN experiment, and by students on several other experiments. In this talk we will showcase these tools through examples, highlighting how they address the different experiments’ needs, and compare them to other similar approaches. |
url |
https://www.epj-conferences.org/articles/epjconf/pdf/2020/21/epjconf_chep2020_06016.pdf |
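The processing flow described in the abstract (event-level data in, binned tables out) can be illustrated with a minimal sketch in plain pandas. This is not the FAST-HEP YAML syntax or API, neither of which appears in this record. The column names, bin edges, cut value, and the helper `trees_to_table` are all hypothetical, chosen only to show the three stages of adding a variable, selecting events, and binning into a dataframe.

```python
import numpy as np
import pandas as pd

# Hypothetical event-level data standing in for a tree: one row per event.
events = pd.DataFrame({
    "dataset": ["data", "data", "mc", "mc", "mc", "data"],
    "muon_px": [30.0, 3.0, 40.0, 0.0, 12.0, 60.0],
    "muon_py": [40.0, 4.0, 30.0, 8.0, 16.0, 80.0],
    "n_jets": [2, 1, 3, 0, 2, 4],
})


def trees_to_table(events, pt_edges=(0, 25, 75, 200), min_jets=1):
    """Illustrative helper (an assumption, not part of FAST-HEP)."""
    # Stage 1: add a derived variable (transverse momentum).
    out = events.assign(muon_pt=np.hypot(events["muon_px"], events["muon_py"]))
    # Stage 2: select events passing a simple cut.
    out = out[out["n_jets"] >= min_jets]
    # Stage 3: bin the variable and count events per dataset and bin.
    pt_bin = pd.cut(out["muon_pt"], bins=list(pt_edges), right=False)
    counts = out.groupby(["dataset", pt_bin], observed=False).size()
    # The result is an ordinary dataframe: the "table".
    return counts.rename("n").reset_index()


table = trees_to_table(events)
print(table)
```

Because the result is an ordinary dataframe, it can be filtered, pivoted, or plotted with standard pandas and Matplotlib calls, which is the property the abstract highlights for the toolset's outputs.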