Towards Data-Driven I/O Load Balancing in Extreme-Scale Storage Systems

Storage systems used for supercomputers and high performance computing (HPC) centers exhibit load imbalance and resource contention. This is mainly due to two factors: the bursty nature of the I/O of scientific applications; and the complex and distributed I/O path without centralized arbitration an...

Bibliographic Details
Main Author: Banavathi Srinivasa, Sangeetha
Other Authors: Computer Science
Format: Others
Published: Virginia Tech 2018
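Subjects: Lustre storage system; Markov chain model; Max-flow algorithm; Publisher-Subscriber; I/O load balance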
Online Access: http://hdl.handle.net/10919/86272
id ndltd-VTETD-oai-vtechworks.lib.vt.edu-10919-86272
record_format oai_dc
spelling ndltd-VTETD-oai-vtechworks.lib.vt.edu-10919-86272
2021-03-13T05:31:39Z
Towards Data-Driven I/O Load Balancing in Extreme-Scale Storage Systems
Banavathi Srinivasa, Sangeetha
Computer Science
Butt, Ali R.
Raghvendra, Sharath
Polys, Nicholas Fearing
Lustre storage system
Markov chain model
Max-flow algorithm
Publisher-Subscriber
I/O load balance
Master of Science
2018-12-08T07:00:33Z
2018-12-08T07:00:33Z
2017-06-15
Thesis
vt_gsexam:11787
http://hdl.handle.net/10919/86272
In Copyright
http://rightsstatements.org/vocab/InC/1.0/
ETD
application/pdf
Virginia Tech
collection NDLTD
format Others
sources NDLTD
topic Lustre storage system
Markov chain model
Max-flow algorithm
Publisher-Subscriber
I/O load balance
description Storage systems used for supercomputers and high performance computing (HPC) centers exhibit load imbalance and resource contention. This is mainly due to two factors: the bursty nature of the I/O of scientific applications, and the complex, distributed I/O path, which lacks centralized arbitration and control. For example, the extant Lustre parallel storage system, which forms the backend storage for many HPC centers, comprises numerous components, all connected in custom network topologies, and serves the varying demands of a large number of users and applications. Consequently, some storage servers can be more heavily loaded than others, creating bottlenecks and reducing overall application I/O performance. Existing solutions focus on per-application load balancing and thus are not effective, because they lack a global view of the system. In this thesis, we adopt a data-driven, quantitative approach to load-balance the I/O servers at extreme scale. To this end, we design a global mapper on the Lustre Metadata Server (MDS), which gathers runtime statistics collected from key storage components on the I/O path and applies Markov chain modeling and a dynamic maximum flow algorithm to decide where data should be placed in a load-balanced fashion. Evaluation using a realistic system simulator shows that our approach yields better load balancing, which in turn can help yield higher end-to-end performance. Degree: Master of Science
author2 Computer Science
author Banavathi Srinivasa, Sangeetha
title Towards Data-Driven I/O Load Balancing in Extreme-Scale Storage Systems
publisher Virginia Tech
publishDate 2018
url http://hdl.handle.net/10919/86272