Extensible Networked-storage Virtualization with Metadata Management at the Block Level

Increased scaling costs and lack of desired features is leading to the evolution of high-performance storage systems from centralized architectures and specialized hardware to decentralized, commodity storage clusters. Existing systems try to address storage cost and management issues at the filesys...

Full description

Bibliographic Details
Main Author: Flouris, Michail D.
Other Authors: Bilas, Angelos
Language:en_ca
Published: 2009
Subjects:
Online Access:http://hdl.handle.net/1807/17759
Description
Summary:Increased scaling costs and lack of desired features is leading to the evolution of high-performance storage systems from centralized architectures and specialized hardware to decentralized, commodity storage clusters. Existing systems try to address storage cost and management issues at the filesystem level. Besides dictating the use of a specific filesystem, however, this approach leads to increased complexity and load imbalance towards the file-server side, which in turn increase costs to scale. In this thesis, we examine these problems at the block-level. This approach has several advantages, such as transparency, cost-efficiency, better resource utilization, simplicity and easier management. First of all, we explore the mechanisms, the merits, and the overheads associated with advanced metadata-intensive functionality at the block level, by providing versioning at the block level. We find that block-level versioning has low overhead and offers transparency and simplicity advantages over filesystem-based approaches. Secondly, we study the problem of providing extensibility required by diverse and changing application needs that may use a single storage system. We provide support for (i)adding desired functions as block-level extensions, and (ii)flexibly combining them to create modular I/O hierarchies. In this direction, we design, implement and evaluate an extensible block-level storage virtualization framework, Violin, with support for metadata-intensive functions. Extending Violin we build Orchestra, an extensible framework for cluster storage virtualization and scalable storage sharing at the block-level. We show that Orchestra's enhanced block interface can substantially simplify the design of higher-level storage services, such as cluster filesystems, while being scalable. Finally, we consider the problem of consistency and availability in decentralized commodity clusters. We propose RIBD, a novel storage system that provides support for handling both data and metadata consistency issues at the block layer. RIBD uses the notion of consistency intervals (CIs) to provide fine-grain consistency semantics on sequences of block level operations by means of a lightweight transactional mechanism. RIBD relies on Orchestra's virtualization mechanisms and uses a roll-back recovery mechanism based on low-overhead block-level versioning. We evaluate RIBD on a cluster of 24 nodes, and find that it performs comparably to two popular cluster filesystems, PVFS and GFS, while offering stronger consistency guarantees.