Towards Data-Driven I/O Load Balancing in Extreme-Scale Storage Systems
Main Author: | Banavathi Srinivasa, Sangeetha |
---|---|
Other Authors: | Butt, Ali R.; Raghvendra, Sharath; Polys, Nicholas Fearing |
Format: | Thesis |
Published: | Virginia Tech, 2018 |
Degree: | Master of Science, Computer Science |
Dates: | Issued 2017-06-15; deposited 2018-12-08 |
Subjects: | Lustre storage system; Markov chain model; Max-flow algorithm; Publisher-Subscriber; I/O load balance |
Rights: | In Copyright (http://rightsstatements.org/vocab/InC/1.0/) |
Online Access: | http://hdl.handle.net/10919/86272 |
description |
Storage systems used for supercomputers and high-performance computing (HPC) centers exhibit load imbalance and resource contention. This is mainly due to two factors: the bursty nature of scientific applications' I/O, and the complex, distributed I/O path that lacks centralized arbitration and control. For example, the Lustre parallel storage system, which forms the backend storage for many HPC centers, comprises numerous components connected in custom network topologies and serves the varying demands of a large number of users and applications. Consequently, some storage servers can become more loaded than others, creating bottlenecks and reducing overall application I/O performance. Existing solutions focus on per-application load balancing and are thus ineffective, as they lack a global view of the system.

In this thesis, we adopt a data-driven, quantitative approach to load balancing the I/O servers at extreme scale. To this end, we design a global mapper on the Lustre Metadata Server (MDS), which gathers runtime statistics from key storage components on the I/O path and applies Markov chain modeling and a dynamic maximum flow algorithm to decide where data should be placed in a load-balanced fashion. Evaluation using a realistic system simulator shows that our approach yields better load balancing, which in turn can help deliver higher end-to-end performance. |
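The two-stage idea in the abstract can be illustrated with a minimal sketch. All names, states, and statistics below are hypothetical, not taken from the thesis, and a simple greedy round-robin stands in for the thesis's dynamic maximum-flow placement: a first-order Markov chain predicts each OST's (Object Storage Target's) next load state from observed transitions, and file stripes are then assigned to the servers predicted to be least loaded.

```python
from collections import defaultdict

# Illustrative load states for a Lustre OST (hypothetical discretization).
STATES = ["low", "medium", "high"]

def transition_matrix(history):
    """Estimate first-order Markov transition probabilities from an
    observed sequence of per-OST load states."""
    counts = defaultdict(lambda: defaultdict(int))
    for cur, nxt in zip(history, history[1:]):
        counts[cur][nxt] += 1
    probs = {}
    for cur, nxts in counts.items():
        total = sum(nxts.values())
        probs[cur] = {s: nxts.get(s, 0) / total for s in STATES}
    return probs

def predict_next(probs, current):
    """Most likely next load state; if this state was never observed,
    assume the load stays where it is."""
    row = probs.get(current)
    if row is None:
        return current
    return max(row, key=row.get)

def place_stripes(predicted_load, n_stripes):
    """Greedy stand-in for the thesis's max-flow placement: round-robin
    stripes over OSTs ordered by predicted load, lightest first."""
    rank = {"low": 0, "medium": 1, "high": 2}
    ordered = sorted(predicted_load, key=lambda ost: rank[predicted_load[ost]])
    return [ordered[i % len(ordered)] for i in range(n_stripes)]

# Usage with made-up statistics: predict each OST's next load state,
# then choose OSTs for a four-stripe file.
probs = transition_matrix(["low", "high", "high", "high", "low", "high"])
current = {"OST0": "high", "OST1": "low", "OST2": "medium"}
predicted = {ost: predict_next(probs, state) for ost, state in current.items()}
stripes = place_stripes(predicted, 4)
```

In the actual design, the runtime statistics feeding the model would come from the MDS-side global mapper rather than a hard-coded history, and the max-flow formulation lets placement respect capacity constraints that this greedy stand-in ignores.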