Summary: | Modern file systems associate the deletion of a file with the immediate release of
storage, and file writes with the irrevocable change of file contents. We argue that
this behavior is a relic of the past, when disk storage was a scarce resource. Today,
large cheap disks make it possible for the file system to protect valuable data from
accidental deletion or overwriting.
This thesis describes the design, implementation, and performance of the
Elephant file system, which automatically retains all important versions of user
files. Users name previous file versions by combining a traditional pathname with
a time when the desired version of a file or directory existed. Storage in Elephant
is managed by the system using file-grain retention policies specified by users. This
approach contrasts with checkpointing file systems such as Plan 9, AFS, and WAFL
that periodically generate efficient checkpoints of entire file systems and thus
apply a single retention policy to all files within that file system. Most
file systems have no built-in support for recovering data.
|