Summary: | Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2005. === Includes bibliographical references (leaves 137-140). === We present a trace-hash scheme and an adaptive tree-trace scheme to improve the performance of checking the integrity of arbitrarily-large untrusted data, when using only a small fixed-sized trusted state. Currently, hash trees are used to check the data. In many systems that use hash trees, programs perform many data operations before performing a critical operation that exports a result outside of the program's execution environment. The trace-hash and adaptive tree-trace schemes check sequences of data operations. For each of the schemes, for all programs, as the average number of times the program accesses data between critical operations increases, the scheme's bandwidth overhead approaches a constant bandwidth overhead. The trace-hash scheme, intuitively, maintains a "write trace" and a "read trace" of the write and read operations on the untrusted data. At runtime, the traces are updated with a minimal constant-sized bandwidth overhead so that the integrity of a sequence of data operations can be verified at a later time. To maintain the traces in a small fixed-sized trusted space, we introduce a new cryptographic tool, incremental multiset hash functions, to update the traces. To check a sequence of operations, a separate integrity-check operation is performed using the traces. === (cont.) The integrity-check operation is performed whenever the program executes a critical operation: a critical operation acts as a signal indicating when it is necessary to perform the integrity-check operation. When sequences of operations are checked, the trace-hash scheme significantly outperforms the hash tree. Though the trace-hash scheme does not incur the logarithmic bandwidth overhead of the hash tree, its integrity-check operation needs to read all of the data that was used since the beginning of the program's execution. When critical operations occur infrequently, the amortized cost over the number of data operations performed of the integrity-check operation is small and the trace-hash scheme performs very well. However, when critical operations occur frequently, the amortized cost of the integrity-check operation becomes prohibitively large; in this case, the performance of the trace-hash scheme is not good and is much worse than that of the hash tree. Thus, though the trace-hash scheme performs very well when checks are infrequent, it cannot be widely-used because its performance is poor when checks are more frequent. To this end, we also introduce an adaptive tree-trace scheme to optimize the trace-hash scheme and to capture the best features of both the hash tree and trace-hash schemes. === (cont.) The adaptive tree-trace scheme has three features. Firstly, the scheme is adaptive, allowing programs to benefit from its features without any program modification. Secondly, for all programs, the scheme's bandwidth overhead is guaranteed never to be worse than a parameterizable worst-case bound, expressed relative to the bandwidth overhead of the hash tree if the hash tree had been used to check the integrity of the data. Finally, for all programs, as the average number times the program accesses data between critical operations increases, the scheme's bandwidth overhead moves from a logarithmic to a constant bandwidth overhead. === by Dwaine E. Clarke. === Ph.D.
|