Design of a Resilient, High-Throughput, Persistent Storage System for the ATLAS Phase-II DAQ System
The ATLAS experiment will undergo a major upgrade to take advantage of the new conditions provided by the upgraded High-Luminosity LHC. The Trigger and Data Acquisition (TDAQ) system will record data at unprecedented rates: the detectors will be read out at 1 MHz, generating around 5 TB/s of data. The Dataflow (DF) system, a component of TDAQ, introduces a novel design: readout data are buffered on persistent storage while the event filtering system analyses them, selecting 10000 events per second for a total recorded throughput of around 60 GB/s. This approach decouples the detector activity from the event selection process. New challenges then arise for DF: designing and implementing a distributed, reliable, persistent storage system that supports several TB/s of aggregated throughput while providing tens of PB of capacity. In this paper we first describe some of the challenges that DF faces: data safety despite the limitations of persistent storage, high-granularity indexing of data in a highly distributed system, and high-performance management of storage capacity. We then present the ongoing R&D addressing each of them and show the performance achieved with a working prototype.
Main Authors: Adam Abed Abud, Matias Bonaventura, Edoardo Farina, Fabrice Le Goff (European Laboratory for Particle Physics, CERN)
Format: Article
Language: English
Publisher: EDP Sciences
Published: 2021-01-01
Series: EPJ Web of Conferences
Online Access: https://www.epj-conferences.org/articles/epjconf/pdf/2021/05/epjconf_chep2021_04014.pdf
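The throughput figures quoted in the abstract are mutually consistent, which a quick back-of-the-envelope check makes visible. This sketch only restates numbers from the abstract (1 MHz readout, ~5 TB/s, 10000 selected events/s, ~60 GB/s recorded); decimal units are assumed, and the derived per-event sizes are implications, not figures stated in the paper.

```python
# Sanity-check of the throughput figures quoted in the abstract.
# All input numbers come from the abstract; decimal (SI) units assumed.
readout_rate_hz = 1_000_000           # detectors read out at 1 MHz
readout_throughput_bps = 5e12         # ~5 TB/s aggregated readout data
accepted_rate_hz = 10_000             # events selected per second
recorded_throughput_bps = 60e9        # ~60 GB/s recorded throughput

# Implied average event size at readout: ~5 MB/event.
event_size = readout_throughput_bps / readout_rate_hz
print(f"implied readout event size: {event_size / 1e6:.0f} MB")

# Implied average size of a recorded (selected) event: ~6 MB/event.
recorded_event_size = recorded_throughput_bps / accepted_rate_hz
print(f"implied recorded event size: {recorded_event_size / 1e6:.0f} MB")
```

The two implied event sizes land in the same few-megabyte range, so the readout and recording figures describe the same order of event size, with the selection stage reducing the rate by a factor of 100.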
id: doaj-80a8fc2f56974f13994245227665be50
ISSN: 2100-014X
DOI: 10.1051/epjconf/202125104014
Collection: DOAJ