Data mining file sharing metadata : A comparison between Random Forests Classificiation and Bayesian Networks

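In this comparative study based on experimentation it is demonstrated that the two evaluated machine learning techniques, Bayesian networks and random forests, have similar predictive power in the domain of classifying torrents on BitTorrent file sharing networks. This work was performed in two steps. First, a literature analysis was performed to gain insight into how the two techniques work and what types of attacks exist against BitTorrent file sharing networks. After the literature analysis, an experiment was performed to evaluate the accuracy of the two techniques. The results show no significant advantage of using one algorithm over the other when only considering accuracy. However, ease of use lies in Random forests’ favour because the technique requires little pre-processing of the data and still generates accurate results with few false positives.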

Bibliographic Details
Main Author: Petersson, Andreas
Format: Others
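Language: English
Published: Högskolan i Skövde, Institutionen för informationsteknologi 2015
Subjects: machine learning; random forests; bayesian network; bittorrent; file sharing; Computer Sciences; Datavetenskap (datalogi)
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-11180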
id ndltd-UPSALLA1-oai-DiVA.org-his-11180
record_format oai_dc
spelling ndltd-UPSALLA1-oai-DiVA.org-his-11180 | 2018-01-12T05:10:36Z | Student thesis | info:eu-repo/semantics/bachelorThesis | text | application/pdf | info:eu-repo/semantics/openAccess
collection NDLTD
sources NDLTD
topic machine learning
random forests
bayesian network
bittorrent
file sharing
Computer Sciences
Datavetenskap (datalogi)
description In this comparative study based on experimentation it is demonstrated that the two evaluated machine learning techniques, Bayesian networks and random forests, have similar predictive power in the domain of classifying torrents on BitTorrent file sharing networks. This work was performed in two steps. First, a literature analysis was performed to gain insight into how the two techniques work and what types of attacks exist against BitTorrent file sharing networks. After the literature analysis, an experiment was performed to evaluate the accuracy of the two techniques. The results show no significant advantage of using one algorithm over the other when only considering accuracy. However, ease of use lies in Random forests’ favour because the technique requires little pre-processing of the data and still generates accurate results with few false positives.