Elephant : the file system that never forgets

Modern file systems associate the deletion of a file with the immediate release of storage, and file writes with the irrevocable change of file contents. We argue that this behavior is a relic of the past, when disk storage was a scarce resource. Today, large cheap disks make it possible for the...

Full description

Bibliographic Details
Main Author: Santry, Douglas J.
Language:English
Published: 2009
Online Access:http://hdl.handle.net/2429/9748
Description
Summary:Modern file systems associate the deletion of a file with the immediate release of storage, and file writes with the irrevocable change of file contents. We argue that this behavior is a relic of the past, when disk storage was a scarce resource. Today, large cheap disks make it possible for the file system to protect valuable data from accidental delete or overwrite. This thesis describes the design, implementation, and performance of the Elephant file system, which automatically retains all important versions of user files. Users name previous file versions by combining a traditional pathname with a time when the desired version of a file or directory existed. Storage in Elephant is managed by the system using file-grain retention policies specified by users. This approach contrasts with checkpointing file systems such as Plan-9, AFS, and WAFL that periodically generate efficient checkpoints of entire file systems and thus restrict retention to be guided by a single policy for all files within that file system. Most file systems have no built in support for recovering data.