Automatic Authorship Detection Using Textual Patterns Extracted from Integrated Syntactic Graphs

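We apply the integrated syntactic graph feature extraction methodology to the task of automatic authorship detection. This graph-based representation allows integrating different levels of language description into a single structure. We extract textual patterns based on features obtained from shortest path walks over integrated syntactic graphs and apply them to determine the authors of documents. On average, our method outperforms the state-of-the-art approaches and gives consistently high results across different corpora, unlike existing methods. Our results show that our textual patterns are useful for the task of authorship attribution.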

Bibliographic Details
Main Authors: Helena Gómez-Adorno, Grigori Sidorov, David Pinto, Darnes Vilariño, Alexander Gelbukh
Format: Article
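Language: English
Published: MDPI AG 2016-08-01
Series: Sensors
Subjects: integrated syntactic graphs; textual patterns; authorship attribution; authorship verification; shortest paths walks; syntactic n-grams
Online Access: http://www.mdpi.com/1424-8220/16/9/1374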
id doaj-e4c4a6dfcde248eba0ee5a5b3c5c807c
record_format Article
spelling doaj-e4c4a6dfcde248eba0ee5a5b3c5c807c; 2020-11-24T22:17:54Z; eng; MDPI AG; Sensors; 1424-8220; 2016-08-01; 16; 9; 1374; 10.3390/s16091374; s16091374; Automatic Authorship Detection Using Textual Patterns Extracted from Integrated Syntactic Graphs
Helena Gómez-Adorno (Instituto Politécnico Nacional, Centro de Investigación en Computación, Av. Juan de Dios Bátiz S/N, Mexico City 07738, Mexico)
Grigori Sidorov (Instituto Politécnico Nacional, Centro de Investigación en Computación, Av. Juan de Dios Bátiz S/N, Mexico City 07738, Mexico)
David Pinto (Benemérita Universidad Autónoma de Puebla, Facultad de Ciencias de la Computación, Av. San Claudio y 14 Sur, Puebla 72570, Mexico)
Darnes Vilariño (Benemérita Universidad Autónoma de Puebla, Facultad de Ciencias de la Computación, Av. San Claudio y 14 Sur, Puebla 72570, Mexico)
Alexander Gelbukh (Instituto Politécnico Nacional, Centro de Investigación en Computación, Av. Juan de Dios Bátiz S/N, Mexico City 07738, Mexico)
collection DOAJ
language English
format Article
sources DOAJ
author Helena Gómez-Adorno
Grigori Sidorov
David Pinto
Darnes Vilariño
Alexander Gelbukh
author_sort Helena Gómez-Adorno
title Automatic Authorship Detection Using Textual Patterns Extracted from Integrated Syntactic Graphs
publisher MDPI AG
series Sensors
issn 1424-8220
publishDate 2016-08-01
description We apply the integrated syntactic graph feature extraction methodology to the task of automatic authorship detection. This graph-based representation allows integrating different levels of language description into a single structure. We extract textual patterns based on features obtained from shortest path walks over integrated syntactic graphs and apply them to determine the authors of documents. On average, our method outperforms the state-of-the-art approaches and gives consistently high results across different corpora, unlike existing methods. Our results show that our textual patterns are useful for the task of authorship attribution.
topic integrated syntactic graphs
textual patterns
authorship attribution
authorship verification
shortest paths walks
syntactic n-grams
url http://www.mdpi.com/1424-8220/16/9/1374