A Robust Consistency Model of Crowd Workers in Text Labeling Tasks

Crowdsourcing is a popular human-based model for acquiring labeled data. Despite its ability to generate huge amounts of labeled data at moderate cost, it is susceptible to low-quality labels, which can arise from unintentional or intentional errors by the crowd workers. Consistency is an importan...

Bibliographic Details
Main Authors: Fattoh Alqershi, Muhammad Al-Qurishi, Mehmet Sabih Aksoy, Majed Alrubaian, Muhammad Imran
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects: Crowdsourcing; reliability; consistency; text labeling; fake news
Online Access: https://ieeexplore.ieee.org/document/9187781/
id doaj-609d5d99a7724039933a5ffeaa86a104
record_format Article
spelling doaj-609d5d99a7724039933a5ffeaa86a104 (2021-03-30T03:43:09Z, eng)
IEEE Access (IEEE), ISSN 2169-3536, 2020-01-01, vol. 8, pp. 168381-168393
DOI 10.1109/ACCESS.2020.3022773, IEEE document 9187781
Authors and ORCID iDs:
0 Fattoh Alqershi https://orcid.org/0000-0002-9609-4472
1 Muhammad Al-Qurishi https://orcid.org/0000-0002-7594-7325
2 Mehmet Sabih Aksoy https://orcid.org/0000-0003-0118-9602
3 Majed Alrubaian https://orcid.org/0000-0002-9244-8341
4 Muhammad Imran https://orcid.org/0000-0002-6946-2591
Affiliations, in the order listed in the record:
Department of Information Systems, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia
College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia
Department of Information Systems, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia
College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia
College of Applied Computer Science, King Saud University, Riyadh, Saudi Arabia
collection DOAJ
language English
format Article
sources DOAJ
author Fattoh Alqershi
Muhammad Al-Qurishi
Mehmet Sabih Aksoy
Majed Alrubaian
Muhammad Imran
title A Robust Consistency Model of Crowd Workers in Text Labeling Tasks
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2020-01-01
description Crowdsourcing is a popular human-based model for acquiring labeled data. Despite its ability to generate huge amounts of labeled data at moderate cost, it is susceptible to low-quality labels, which can arise from unintentional or intentional errors by the crowd workers. Consistency is an important attribute of reliability. It is a practical metric that evaluates a crowd worker's reliability based on their ability to agree with themselves, that is, to yield the same output when repeatedly given a particular input. Consistency has not yet been sufficiently explored in the literature. In this work, we propose a novel consistency model based on the pairwise comparisons method. We apply this model to unpaid workers. We measure the workers' consistency on tasks of labeling political text-based claims and study the effects of different duplicate-task characteristics on their consistency. Our results show that the proposed model outperforms the current state-of-the-art models in terms of accuracy.
topic Crowdsourcing
reliability
consistency
text labeling
fake news
url https://ieeexplore.ieee.org/document/9187781/