Inter-server communication in the Mammoth file system

Bibliographic Details
Main Author: Pomkoski, Jody James.
Language: English
Published: 2009
Online Access: http://hdl.handle.net/2429/13380
Description
Summary: The Mammoth file system uses a collection of loosely coupled file servers to provide a highly available, widely distributed, scalable file system. Mammoth servers act as peers that cooperatively provide replicated, backup-free storage. Mammoth manages file data at the granularity of whole files. Files are versioned upon any operation that would modify their data; versions are immutable and retained by the system. Versioning simplifies conflict management and permits relaxing the consistency model to tolerate the latency inherent in propagating replicas. Each file or directory expresses its replication and distribution requirements explicitly in its metadata by naming cooperating nodes by IP address. The resulting system is inherently more scalable because nodes do not need to monitor the entire Mammoth system. This thesis describes the design and prototype implementation of a Distribution Manager. This module uses file and directory metadata and replication policies to direct its inter-server communication. The Distribution Manager is composed of two threads that run within a modified user-level NFS server. These threads provide network communication over TCP/IP. A socket cache amortises the relatively expensive set-up of stream sockets. Shared message queues allow asynchronous message processing across threads and nodes. Failures are actively detected and automatically trigger fault recovery.
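
The socket cache mentioned in the summary can be illustrated with a minimal sketch. It is not taken from the thesis: the class name `SocketCache`, the least-recently-used eviction policy, the `capacity` parameter, and the `connects` counter are all assumptions made for illustration, and the record does not say how the prototype's cache is actually organised. The idea shown is only the general one: keep connected stream sockets keyed by peer address so that repeated messages to the same server reuse one connection instead of paying the TCP set-up cost each time.

```python
import socket
import threading
from collections import OrderedDict


class SocketCache:
    """Hypothetical LRU cache of connected TCP sockets keyed by (host, port)."""

    def __init__(self, capacity=8, connect=socket.create_connection):
        self._capacity = capacity          # assumed bound on open connections
        self._connect = connect            # injectable so the cache can be tested offline
        self._sockets = OrderedDict()      # oldest entry first, newest last
        self._lock = threading.Lock()      # the cache is shared between threads
        self.connects = 0                  # number of real connection set-ups performed

    def get(self, host, port):
        """Return a cached socket for the peer, connecting only on a miss."""
        key = (host, port)
        with self._lock:
            sock = self._sockets.get(key)
            if sock is not None:
                # Hit: mark as most recently used and reuse the connection.
                self._sockets.move_to_end(key)
                return sock
            # Miss: pay the set-up cost once, then remember the socket.
            # (Connecting while holding the lock is a simplification.)
            sock = self._connect(key)
            self.connects += 1
            self._sockets[key] = sock
            if len(self._sockets) > self._capacity:
                # Evict and close the least recently used connection.
                _, oldest = self._sockets.popitem(last=False)
                oldest.close()
            return sock

    def discard(self, host, port):
        """Drop a socket, for example after a send fails and the peer is suspect."""
        with self._lock:
            sock = self._sockets.pop((host, port), None)
        if sock is not None:
            sock.close()
```

In this sketch a sender would call `cache.get(peer_ip, port)` before each message and `cache.discard(peer_ip, port)` when a failure is detected, so that a later retry opens a fresh connection.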