Accelerating Science Impact through Big Data Workflow Management and Supercomputing
The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data-driven scientific explorations. ATLAS, one of the largest collaborations ever assembled in the history of science, is at the forefront of research at the LHC. To address an...
Main Authors: | De K., Klimentov A., Maeno T., Mashinistov R., Nilsson P., Oleynik D., Panitkin S., Ryabinkin E., Wenaus T. |
---|---|
Format: | Article |
Language: | English |
Published: | EDP Sciences, 2016-01-01 |
Series: | EPJ Web of Conferences |
Online Access: | http://dx.doi.org/10.1051/epjconf/201610801003 |
id |
doaj-ddba267ce4a94263bb39ff2c7c6a3d55 |
---|---|
record_format |
Article |
spelling |
doaj-ddba267ce4a94263bb39ff2c7c6a3d55; 2021-08-02T12:42:51Z; eng; EDP Sciences; EPJ Web of Conferences; 2100-014X; 2016-01-01; 108; 01003; 10.1051/epjconf/201610801003; epjconf_mmcp2016_01003; Accelerating Science Impact through Big Data Workflow Management and Supercomputing; De K. (0), Klimentov A., Maeno T. (1), Mashinistov R. (2), Nilsson P. (3), Oleynik D., Panitkin S. (4), Ryabinkin E. (5), Wenaus T. (6); (0) Physics Department, University of Texas at Arlington; (1) Physics Department, Brookhaven National Laboratory, Upton; (2) Kurchatov complex of NBIC-technologies; (3) Physics Department, Brookhaven National Laboratory, Upton; (4) Physics Department, Brookhaven National Laboratory, Upton; (5) Kurchatov complex of NBIC-technologies; (6) Physics Department, Brookhaven National Laboratory, Upton; The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data-driven scientific explorations. ATLAS, one of the largest collaborations ever assembled in the history of science, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. The PanDA (Production and Distributed Analysis) Workload Management System is used to manage the workflow for all data processing at hundreds of data centers. An ambitious program to expand PanDA to all available computing resources, including opportunistic use of commercial and academic clouds and Leadership Computing Facilities (LCF), is being realized within the BigPanDA and megaPanDA projects. These projects are now exploring how PanDA might be used for managing computing jobs that run on supercomputers including OLCF’s Titan and NRC-KI HPC2. The main idea is to reuse, as much as possible, existing components of the PanDA system that are already deployed on the LHC Grid for analysis of physics data. The next generation of PanDA will allow many data-intensive sciences employing a variety of computing platforms to benefit from ATLAS experience and proven tools in highly scalable processing.; http://dx.doi.org/10.1051/epjconf/201610801003 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
De K., Klimentov A., Maeno T., Mashinistov R., Nilsson P., Oleynik D., Panitkin S., Ryabinkin E., Wenaus T. |
spellingShingle |
De K. Klimentov A. Maeno T. Mashinistov R. Nilsson P. Oleynik D. Panitkin S. Ryabinkin E. Wenaus T. Accelerating Science Impact through Big Data Workflow Management and Supercomputing EPJ Web of Conferences |
author_facet |
De K., Klimentov A., Maeno T., Mashinistov R., Nilsson P., Oleynik D., Panitkin S., Ryabinkin E., Wenaus T. |
author_sort |
De K. |
title |
Accelerating Science Impact through Big Data Workflow Management and Supercomputing |
publisher |
EDP Sciences |
series |
EPJ Web of Conferences |
issn |
2100-014X |
publishDate |
2016-01-01 |
description |
The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data-driven scientific explorations. ATLAS, one of the largest collaborations ever assembled in the history of science, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. The PanDA (Production and Distributed Analysis) Workload Management System is used to manage the workflow for all data processing at hundreds of data centers. An ambitious program to expand PanDA to all available computing resources, including opportunistic use of commercial and academic clouds and Leadership Computing Facilities (LCF), is being realized within the BigPanDA and megaPanDA projects. These projects are now exploring how PanDA might be used for managing computing jobs that run on supercomputers including OLCF’s Titan and NRC-KI HPC2. The main idea is to reuse, as much as possible, existing components of the PanDA system that are already deployed on the LHC Grid for analysis of physics data. The next generation of PanDA will allow many data-intensive sciences employing a variety of computing platforms to benefit from ATLAS experience and proven tools in highly scalable processing. |
url |
http://dx.doi.org/10.1051/epjconf/201610801003 |
work_keys_str_mv |
AT dek acceleratingscienceimpactthroughbigdataworkflowmanagementandsupercomputing AT klimentova acceleratingscienceimpactthroughbigdataworkflowmanagementandsupercomputing AT maenot acceleratingscienceimpactthroughbigdataworkflowmanagementandsupercomputing AT mashinistovr acceleratingscienceimpactthroughbigdataworkflowmanagementandsupercomputing AT nilssonp acceleratingscienceimpactthroughbigdataworkflowmanagementandsupercomputing AT oleynikd acceleratingscienceimpactthroughbigdataworkflowmanagementandsupercomputing AT panitkins acceleratingscienceimpactthroughbigdataworkflowmanagementandsupercomputing AT ryabinkine acceleratingscienceimpactthroughbigdataworkflowmanagementandsupercomputing AT wenaust acceleratingscienceimpactthroughbigdataworkflowmanagementandsupercomputing |
_version_ |
1721232456278867968 |