Accelerating Science Impact through Big Data Workflow Management and Supercomputing

The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data-driven scientific explorations. ATLAS, one of the largest collaborations ever assembled in the history of science, is at the forefront of research at the LHC. To address an...

Bibliographic Details
Main Authors: De K., Klimentov A., Maeno T., Mashinistov R., Nilsson P., Oleynik D., Panitkin S., Ryabinkin E., Wenaus T.
Format: Article
Language: English
Published: EDP Sciences 2016-01-01
Series: EPJ Web of Conferences
Online Access: http://dx.doi.org/10.1051/epjconf/201610801003
id doaj-ddba267ce4a94263bb39ff2c7c6a3d55
record_format Article
spelling doaj-ddba267ce4a94263bb39ff2c7c6a3d55 | 2021-08-02T12:42:51Z | eng | EDP Sciences | EPJ Web of Conferences | 2100-014X | 2016-01-01 | 108 | 01003 | 10.1051/epjconf/201610801003 | epjconf_mmcp2016_01003 | Accelerating Science Impact through Big Data Workflow Management and Supercomputing | De K. (0), Klimentov A., Maeno T. (1), Mashinistov R. (2), Nilsson P. (3), Oleynik D., Panitkin S. (4), Ryabinkin E. (5), Wenaus T. (6) | (0) Physics Department, University of Texas at Arlington; (1) Physics Department, Brookhaven National Laboratory, Upton; (2) Kurchatov complex of NBIC-technologies; (3) Physics Department, Brookhaven National Laboratory, Upton; (4) Physics Department, Brookhaven National Laboratory, Upton; (5) Kurchatov complex of NBIC-technologies; (6) Physics Department, Brookhaven National Laboratory, Upton | http://dx.doi.org/10.1051/epjconf/201610801003
collection DOAJ
language English
format Article
sources DOAJ
author De K.
Klimentov A.
Maeno T.
Mashinistov R.
Nilsson P.
Oleynik D.
Panitkin S.
Ryabinkin E.
Wenaus T.
title Accelerating Science Impact through Big Data Workflow Management and Supercomputing
publisher EDP Sciences
series EPJ Web of Conferences
issn 2100-014X
publishDate 2016-01-01
description The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data-driven scientific explorations. ATLAS, one of the largest collaborations ever assembled in the history of science, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment relies on a heterogeneous distributed computational infrastructure. The PanDA (Production and Distributed Analysis) Workload Management System is used to manage the workflow for all data processing across hundreds of data centers. An ambitious program to expand PanDA to all available computing resources, including opportunistic use of commercial and academic clouds and Leadership Computing Facilities (LCF), is being realized within the BigPanDA and megaPanDA projects. These projects are now exploring how PanDA might be used to manage computing jobs that run on supercomputers, including OLCF’s Titan and NRC-KI HPC2. The main idea is to reuse, as much as possible, existing components of the PanDA system that are already deployed on the LHC Grid for analysis of physics data. The next generation of PanDA will allow many data-intensive sciences employing a variety of computing platforms to benefit from ATLAS experience and proven tools in highly scalable processing.
url http://dx.doi.org/10.1051/epjconf/201610801003