Seimo posėdžių stenogramų tekstynas autorystės nustatymo bei autoriaus profilio sudarymo tyrimams | Corpus of transcribed parliamentary speeches for authorship attribution and author profiling tasks

This paper presents a corpus of transcribed Lithuanian parliamentary speeches, prepared in a format suited to a range of authorship identification tasks. The corpus consists of approximately 111 thousand texts (24 million words). Each text matches one parliamentary s...

Bibliographic Details
Main Authors: Jurgita Kapočiūtė-Dzikienė, Andrius Utka, Ligita Šarkutė
Format: Article
Language: deu
Published: Vilnius University 2014-12-01
Series: Kalbotyra
Online Access: http://www.kalbotyra.flf.vu.lt/wp-content/uploads/2015/01/Kalbotyra_66_27_45.pdf