Reconstructing Textual File Fragments Using Unsupervised Machine Learning Techniques

This work is an investigation into reconstructing fragmented ASCII files based on content analysis, motivated by a desire to demonstrate machine learning's applicability to Digital Forensics. Using a categorized corpus of Usenet, Bulletin Board Systems, and other assorted documents, a series of e...


Bibliographic Details
Main Author: Roux, Brian
Format: Others
Published: ScholarWorks@UNO 2008
Subjects:
SVM
Online Access: http://scholarworks.uno.edu/td/881
http://scholarworks.uno.edu/cgi/viewcontent.cgi?article=1861&context=td
id ndltd-uno.edu-oai-scholarworks.uno.edu-td-1861
record_format oai_dc
spelling ndltd-uno.edu-oai-scholarworks.uno.edu-td-1861 2016-10-21T17:04:51Z 2008-12-19T08:00:00Z text application/pdf University of New Orleans Theses and Dissertations ScholarWorks@UNO
collection NDLTD
format Others
sources NDLTD
topic Machine Learning
File Carving
Fragmented Files
Support Vector Machines
SVM
Digital Forensics
Information Retrieval
description This work investigates reconstructing fragmented ASCII files through content analysis, motivated by a desire to demonstrate machine learning's applicability to Digital Forensics. Using a categorized corpus of Usenet, Bulletin Board Systems, and other assorted documents, a series of experiments is conducted in which machine learning techniques are used to train classifiers that can identify fragments belonging to the same original file. The primary machine learning method is the Support Vector Machine (SVM), trained on a variety of extracted features. Additional work covers training committees of SVMs to boost classification power beyond that of the individual SVMs, and developing a method to tune SVM kernel parameters using a genetic algorithm. Attention is also given to the applicability of Information Retrieval techniques to file fragments and to an analysis of textual artifacts that are not present in standard dictionaries.
author Roux, Brian
title Reconstructing Textual File Fragments Using Unsupervised Machine Learning Techniques
publisher ScholarWorks@UNO
publishDate 2008
url http://scholarworks.uno.edu/td/881
http://scholarworks.uno.edu/cgi/viewcontent.cgi?article=1861&context=td