Improving sector hash carving with rule-based and entropy-based non-probative block filters

Approved for public release; distribution is unlimited === Digital forensic investigators have traditionally used file hashes to identify known content on searched media. Recently, sector hashing has been proposed as an alternative identification method, in which files are broken up into blocks, whi...

Full description

Bibliographic Details
Main Author: Gutierrez-Villarreal, Francisco Javier
Other Authors: McCarrin, Michael R.
Published: Monterey, California: Naval Postgraduate School 2015
Online Access:http://hdl.handle.net/10945/45194
Description
Summary:Approved for public release; distribution is unlimited === Digital forensic investigators have traditionally used file hashes to identify known content on searched media. Recently, sector hashing has been proposed as an alternative identification method, in which files are broken up into blocks, which are then compared to sectors on searched media. Since sectors are read sequentially without accessing the file system, sector hashing can be parallelized easily and is faster than traditional methods. In addition, sector hashing can identify partial files, and does not require an exact file match. In some cases, the presence of even a single block is sufficient to demonstrate with high probability that a file resides on a drive. However, non-probative blocks, common across many files, generate false positive matches; a problem that must be addressed before sector hashing can be adopted. We conduct 7 experiments in two phases to filter non-probative blocks. Our first phase uses rule-based and entropy-based non-probative block filters to improve matching against all file types. In the second phase, we restrict the problem to JPEG files. We find that for general hash-based carving, a rule-based approach outperforms a simple entropy threshold. When searching for JPEGs, we find that an entropy threshold of 10.9 gives a precision of 80% and an accuracy of 99%.