Computational approaches for verbal deception detection

Deception exists in all aspects of life and is particularly evident on the Web. Deception includes child sexual predators grooming victims online, medical news headlines with little medical evidence or scientific rigour, individuals claiming others’ work as their own, and systematic deception of com...

Bibliographic Details
Main Author: Vartapetiance, Anna
Other Authors: Gillam, L.
Published: University of Surrey 2015
Subjects:
004
Online Access: https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.635560
id ndltd-bl.uk-oai-ethos.bl.uk-635560
record_format oai_dc
spelling ndltd-bl.uk-oai-ethos.bl.uk-635560 2019-03-05T15:41:42Z Computational approaches for verbal deception detection Vartapetiance, Anna Gillam, L. 2015 004 University of Surrey https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.635560 http://epubs.surrey.ac.uk/807037/ Electronic Thesis or Dissertation
collection NDLTD
sources NDLTD
topic 004
spellingShingle 004
Vartapetiance, Anna
Computational approaches for verbal deception detection
description Deception exists in all aspects of life and is particularly evident on the Web. Deception includes child sexual predators grooming victims online, medical news headlines with little medical evidence or scientific rigour, individuals claiming others’ work as their own, and systematic deception of company shareholders and institutional investors leading to corporate collapses. This thesis explores the potential for automatic detection of deception. We investigate the nature of deception and the related cues, focusing in particular on Verbal Cues, and concluding that they cannot be readily generalised. We demonstrate how deception-specific features, based on sound hypotheses, can overcome related limitations by presenting approaches for three different examples of deception – namely Child Sexual Predator Detection (SPD), Authorship Identification (AI) and Intrinsic Plagiarism Detection (IPD). We further show how our approaches result in competitive levels of reliability. For SPD we develop our approach largely based on the commonality of requests for key personal information. To address AI, we introduce approaches based on a frequency-mean-variance and a frequency-only framework in order to detect strong associations between co-occurring patterns of a limited number of stopwords. Our IPD approaches are based on simple commonality of words at document level and usage of proper nouns; document sections lacking commonality can be identified as plagiarised. The frameworks of the International Workshop on Uncovering Plagiarism, Authorship, and Social Software Misuse (PAN) competitions provided an independent evaluation of the approaches. The SPD approach obtained an F1 score of 0.48. F1 scores of 0.47, 0.53 and 0.57 were achieved in AI tasks for PAN2012, 2013 and 2014 respectively. IPD yielded an overall accuracy of 91%. Through post-competition adaptations we also show how to improve the approaches and the scores and demonstrate the importance of suitable datasets and how most approaches are not easily transferable between various types of deception.
author2 Gillam, L.
author_facet Gillam, L.
Vartapetiance, Anna
author Vartapetiance, Anna
author_sort Vartapetiance, Anna
title Computational approaches for verbal deception detection
publisher University of Surrey
publishDate 2015
url https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.635560