Using Artificial Intelligence to Verify Authorship of Anonymous Social Media Posts

The widespread use of social media, together with the ease of concealing one’s identity amid the proliferation of ubiquitous technology and the digitization of crime and terrorism, has increased the need for ways to find out who hides behind an anonymous alias. This report deals wi...

Bibliographic Details
Main Author: Lagerholm, Filip
Format: Others
Language: English
Published: Mälardalens högskola, Akademin för innovation, design och teknik 2017
Subjects: Computer Sciences; Datavetenskap (datalogi)
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-35551
id ndltd-UPSALLA1-oai-DiVA.org-mdh-35551
record_format oai_dc
spelling ndltd-UPSALLA1-oai-DiVA.org-mdh-35551
2018-01-14T05:10:32Z
Using Artificial Intelligence to Verify Authorship of Anonymous Social Media Posts
eng
Lagerholm, Filip
Mälardalens högskola, Akademin för innovation, design och teknik
2017
Computer Sciences
Datavetenskap (datalogi)
Student thesis
info:eu-repo/semantics/bachelorThesis
text
http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-35551
application/pdf
info:eu-repo/semantics/openAccess
collection NDLTD
language English
format Others
sources NDLTD
topic Computer Sciences
Datavetenskap (datalogi)
description The widespread use of social media, together with the ease of concealing one’s identity amid the proliferation of ubiquitous technology and the digitization of crime and terrorism, has increased the need for ways to find out who hides behind an anonymous alias. This report deals with authorship verification of posts written on Twitter, with the purpose of investigating whether it is possible to develop an auxiliary tool that can be used in criminal investigations. The main research question is whether a set of tweets written by an anonymous user can be matched to another set of tweets written by a known user and whether, based on their linguistic styles, a probability that the authors are the same can be calculated. The report also examines how linguistic styles can be extracted for use in an artificially intelligent classifier, and how much data is needed to get adequate results. The subject is relevant because the work concerns a potential future scenario in which digital crimes are difficult to investigate with traditional network-based tracking techniques. The approach is to evaluate traditional feature-extraction methods from natural language processing and to classify the extracted features using a type of recurrent neural network called Long Short-Term Memory. While the best result in one experiment reached an accuracy of 93.32%, the overall results showed that the choice of representation and the amount of data used are crucial. This thesis complements existing knowledge by focusing on very short texts in the form of social media posts.
author Lagerholm, Filip
title Using Artificial Intelligence to Verify Authorship of Anonymous Social Media Posts
publisher Mälardalens högskola, Akademin för innovation, design och teknik
publishDate 2017
url http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-35551
work_keys_str_mv AT lagerholmfilip usingartificialintelligencetoverifyauthorshipofanonymoussocialmediaposts
_version_ 1718609331485671424
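
The description mentions extracting linguistic style from sets of short posts before classification. As a rough illustration only, the sketch below builds character n-gram frequency profiles for two sets of posts and compares them with cosine similarity. This is an assumed stand-in for a "traditional" feature-extraction step; it is not the thesis's method, which classifies features with a Long Short-Term Memory network, and the function names and sample posts are hypothetical.

```python
from collections import Counter
from math import sqrt


def char_ngram_profile(posts, n=3):
    """Return relative frequencies of character n-grams over a set of posts."""
    counts = Counter()
    for post in posts:
        text = post.lower()
        # Posts shorter than n characters contribute no n-grams.
        counts.update(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: count / total for gram, count in counts.items()}


def cosine_similarity(p, q):
    """Cosine similarity between two sparse frequency profiles (dicts)."""
    dot = sum(value * q.get(gram, 0.0) for gram, value in p.items())
    norm_p = sqrt(sum(value * value for value in p.values()))
    norm_q = sqrt(sum(value * value for value in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0
    return dot / (norm_p * norm_q)


if __name__ == "__main__":
    # Hypothetical sample posts, invented for illustration.
    known_posts = ["Heading to the game tonight!!", "cant wait for the weekend tbh"]
    anonymous_posts = ["cant believe the game tonight tbh", "Weekend plans? none!!"]

    known_profile = char_ngram_profile(known_posts)
    anonymous_profile = char_ngram_profile(anonymous_posts)
    print(cosine_similarity(known_profile, anonymous_profile))
```

A similarity score of this kind is only a crude stylistic signal and is not a calibrated probability that two authors are the same; as the description notes, the choice of representation and the amount of data strongly affect results.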