Summary: | Approved for public release; distribution is unlimited === This thesis investigates a novel approach to identifying discriminating features of communications involving abusive hosts. The technique uses per-packet TCP header and timing features to identify congestion, flow-control, and other low-level network and system characteristics. These characteristics are inherent to the poorly connected, under-provisioned, low-end, and overloaded hosts or links typical of abusive infrastructure, making them difficult for an adversary to manipulate. Supervised classifiers use these features to infer likely abusive network hosts. Whereas prior work investigates such features to opportunistically identify inbound abusive traffic, this thesis seeks to perform active probing to characterize abusive infrastructure more generally. Our approach is IP address and content agnostic, and therefore privacy-preserving, which permits wider deployment. In experiments with known-abusive web sites, we achieve a classification accuracy of 94 percent with a 3 percent false positive rate using only transport features. Our results suggest that transport traffic analysis can identify and block, in real time, abusive hosts unknown to blocklists, and can provide a difficult-to-subvert addition to existing schemes.
|