Building Reliable and Cost-Effective Storage Systems for High-Performance Computing Datacenters

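In this dissertation, I first incorporate declustered redundant array of independent disks (RAID) technology into the existing system to maximize the aggregated recovery I/O and accelerate post-failure remediation. My analytical model confirms that the accelerated data recovery stage significantly improves storage reliability. I then present a proactive data protection framework that augments storage availability and reliability. It uses failure prediction methods to efficiently rescue data on drives before failures occur, which significantly reduces storage downtime and lowers the risk of nested failures. Finally, I investigate how an active storage system enables energy-efficient computing. I explore an emerging storage device, the Ethernet drive, to offload data-intensive workloads from the host to the drives and process the data there. This not only minimizes data movement and power usage, but also enhances data availability and storage scalability. In summary, my dissertation research provides intelligence at the drive, storage node, and system levels to tackle the rising reliability challenge in modern HPC datacenters. The results indicate that this novel storage paradigm cost-effectively improves storage scalability, availability, and reliability.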

Bibliographic Details
Main Author: Qiao, Zhi
Other Authors: Fu, Song
Format: Others
Language: English
Published: University of North Texas 2020
Subjects: Reliability; Storage Systems; High Performance Computing; Computer Science
Online Access: https://digital.library.unt.edu/ark:/67531/metadc1707348/
id ndltd-unt.edu-info-ark-67531-metadc1707348
record_format oai_dc
spelling ndltd-unt.edu-info-ark-67531-metadc1707348 2021-11-25T05:32:58Z
Fu, Song
Kavi, Krishna
Yuan, Xiaohui
Chen, Hsing-Bung
2020-08
Thesis or Dissertation
xii, 125 pages
Text
local-cont-no: submission_2164
ark: ark:/67531/metadc1707348
English
Public
Copyright is held by the author, unless otherwise noted. All rights reserved.
collection NDLTD
language English
format Others
sources NDLTD
topic Reliability
Storage Systems
High Performance Computing
Computer Science
description In this dissertation, I first incorporate declustered redundant array of independent disks (RAID) technology into the existing system to maximize the aggregated recovery I/O and accelerate post-failure remediation. My analytical model confirms that the accelerated data recovery stage significantly improves storage reliability. I then present a proactive data protection framework that augments storage availability and reliability. It uses failure prediction methods to efficiently rescue data on drives before failures occur, which significantly reduces storage downtime and lowers the risk of nested failures. Finally, I investigate how an active storage system enables energy-efficient computing. I explore an emerging storage device, the Ethernet drive, to offload data-intensive workloads from the host to the drives and process the data there. This not only minimizes data movement and power usage, but also enhances data availability and storage scalability. In summary, my dissertation research provides intelligence at the drive, storage node, and system levels to tackle the rising reliability challenge in modern HPC datacenters. The results indicate that this novel storage paradigm cost-effectively improves storage scalability, availability, and reliability.
author2 Fu, Song
author Qiao, Zhi
title Building Reliable and Cost-Effective Storage Systems for High-Performance Computing Datacenters
publisher University of North Texas
publishDate 2020
url https://digital.library.unt.edu/ark:/67531/metadc1707348/