Topology Aware Algorithm for Two-Phase I/O in Clusters With Tapered Hierarchical Networks

Large-scale scientific simulations need to read data from and write data to parallel file systems efficiently. To reduce the I/O bottleneck of such simulations, many middleware solutions have been developed, and the two-phase scheme is one well-known I/O algorithm designed for collecti...


Bibliographic Details
Main Authors: Weifeng Liu, Linping Wu, Xiaowen Xu
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects: MPI-IO; two-phase I/O; communication topology; communication optimization
Online Access: https://ieeexplore.ieee.org/document/9057562/
id doaj-8c4759233a1b4d41bfef4e64a2b4cdd6
record_format Article
spelling doaj-8c4759233a1b4d41bfef4e64a2b4cdd6
2021-03-30T03:16:12Z
eng
IEEE
IEEE Access, ISSN 2169-3536
2020-01-01, vol. 8, pp. 66917-66930
DOI 10.1109/ACCESS.2020.2985928, article 9057562
Topology Aware Algorithm for Two-Phase I/O in Clusters With Tapered Hierarchical Networks
Weifeng Liu, https://orcid.org/0000-0002-1160-6687, Institute of Applied Physics and Computational Mathematics, Beijing, China
Linping Wu, https://orcid.org/0000-0002-0146-2747, Institute of Applied Physics and Computational Mathematics, Beijing, China
Xiaowen Xu, https://orcid.org/0000-0001-6032-454X, Institute of Applied Physics and Computational Mathematics, Beijing, China
https://ieeexplore.ieee.org/document/9057562/
MPI-IO; two-phase I/O; communication topology; communication optimization
collection DOAJ
language English
format Article
sources DOAJ
author Weifeng Liu
Linping Wu
Xiaowen Xu
title Topology Aware Algorithm for Two-Phase I/O in Clusters With Tapered Hierarchical Networks
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2020-01-01
description Large-scale scientific simulations need to read data from and write data to parallel file systems efficiently. To reduce the I/O bottleneck of such simulations, many middleware solutions have been developed, and the two-phase scheme is one well-known algorithm designed for collective I/O operations. In two-phase I/O, a subset of processes is selected to aggregate non-contiguous pieces of data in the shuffle phase before performing collective reads or writes in the I/O phase. Meanwhile, tapered hierarchical networks have long been proposed as a way to reduce procurement and power costs. In such networks, the lower levels provide higher bandwidth and lower latency. In this paper, we present a new implementation of the two-phase I/O algorithm that takes the communication pattern and the topology of the tapered hierarchical network into account when scheduling inter-process communication during the shuffle phase. We validated the new algorithm on our high-performance computers and collected experimental data on the I/O kernels of some simulations. Compared with previous two-phase I/O implementations, the new algorithm significantly improved shuffle-phase performance.
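The two phases named in the description can be illustrated with a small single-process simulation. This is only a sketch of the generic two-phase write scheme, not the authors' topology-aware implementation: the function names, the equal-sized file domains, the aggregator choice, and the rank-to-switch map are all assumptions made for illustration, and no real MPI calls are involved.

```python
from collections import defaultdict


def two_phase_write(pieces_by_rank, file_size, aggregators):
    """Simulate a generic two-phase collective write (hypothetical helper).

    pieces_by_rank: {rank: [(file_offset, bytes), ...]}, non-contiguous pieces.
    aggregators: ranks assumed to act as aggregators, one file domain each.
    Returns (file_contents, messages); messages holds (src, dst, nbytes).
    """
    n_agg = len(aggregators)
    domain = -(-file_size // n_agg)  # ceiling division: bytes per file domain

    # Shuffle phase: split each piece at domain boundaries and send every
    # fragment to the aggregator that owns the target file domain.
    inbox = defaultdict(list)
    messages = []
    for rank, pieces in pieces_by_rank.items():
        for offset, data in pieces:
            pos = 0
            while pos < len(data):
                idx = (offset + pos) // domain
                end = min(len(data), (idx + 1) * domain - offset)
                agg = aggregators[idx]
                inbox[agg].append((offset + pos, data[pos:end]))
                messages.append((rank, agg, end - pos))
                pos = end

    # I/O phase: each aggregator assembles its domain in a buffer and
    # issues a single contiguous write for it.
    out = bytearray(file_size)
    for idx, agg in enumerate(aggregators):
        start = min(idx * domain, file_size)
        stop = min(start + domain, file_size)
        buf = bytearray(stop - start)
        for off, chunk in inbox[agg]:
            buf[off - start:off - start + len(chunk)] = chunk
        out[start:stop] = buf
    return bytes(out), messages


def cross_switch_bytes(messages, switch_of):
    """Bytes of shuffle traffic whose endpoints sit under different switches."""
    return sum(n for src, dst, n in messages if switch_of[src] != switch_of[dst])


# Four ranks write interleaved 2-byte pieces into a 16-byte file.
pieces = {
    0: [(0, b"aa"), (8, b"AA")],
    1: [(2, b"bb"), (10, b"BB")],
    2: [(4, b"cc"), (12, b"CC")],
    3: [(6, b"dd"), (14, b"DD")],
}
data, msgs = two_phase_write(pieces, 16, aggregators=[0, 2])
# Assumed layout: ranks 0 and 1 under one leaf switch, ranks 2 and 3 under another.
upper_level_traffic = cross_switch_bytes(msgs, {0: 0, 1: 0, 2: 1, 3: 1})
```

In a tapered hierarchical network the traffic counted by `cross_switch_bytes` is the part that travels over the slower upper levels, which is why the order and placement of shuffle-phase messages matter.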
topic MPI-IO
two-phase I/O
communication topology
communication optimization
url https://ieeexplore.ieee.org/document/9057562/