The Hierarchical Rater Thresholds Model for Multiple Raters and Multiple Items

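In educational measurement, various methods have been proposed to infer student proficiency from the ratings of multiple items (e.g., essays) by multiple raters. However, suitable models quickly become numerically demanding or even unfeasible as separate latent variables are needed to account for local dependencies between the ratings of the same response. Therefore, in the present paper we derive a flexible approach based on Thurstone’s law of categorical judgment. The advantage of this approach is that it can be fit using weighted least squares estimation, which is computationally less demanding than most previous approaches as the number of latent variables increases. In addition, the new approach can be applied using existing latent variable modeling software. We illustrate the model on a real dataset from the Trends in International Mathematics and Science Study (TIMSS) comprising ratings of 10 items by 4 raters for 150 subjects. In addition, we compare the new model to existing models including the facet model, the hierarchical rater model, and the hierarchical rater latent class model.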

Bibliographic Details
Main Authors: Molenaar Dylan, Uluman Müge, Tavşancıl Ezel, De Boeck Paul
Format: Article
Language: English
Published: De Gruyter 2021-01-01
Series: Open Education Studies
Subjects: rating data; item response theory; local independence; hierarchical rater model
Online Access: https://doi.org/10.1515/edu-2020-0105
id doaj-66b3def9f6b9467f9e1b2493745e16b9
record_format Article
spelling doaj-66b3def9f6b9467f9e1b2493745e16b9
Record updated: 2021-09-22T06:13:06Z
Language code: eng
Publisher: De Gruyter
Journal: Open Education Studies
ISSN: 2544-7831
Publication date: 2021-01-01
Volume/issue/pages: 3 / 1 / 33-48
DOI: 10.1515/edu-2020-0105
Author affiliations:
Molenaar Dylan: Department of Psychology, University of Amsterdam, The Netherlands
Uluman Müge: Department of Educational Measurement and Evaluation, Ankara University, Turkey
Tavşancıl Ezel: Department of Educational Measurement and Evaluation, Ankara University, Turkey
De Boeck Paul: Department of Psychology, Ohio State University, USA
collection DOAJ
language English
format Article
sources DOAJ
author Molenaar Dylan
Uluman Müge
Tavşancıl Ezel
De Boeck Paul
publisher De Gruyter
series Open Education Studies
issn 2544-7831
publishDate 2021-01-01
description In educational measurement, various methods have been proposed to infer student proficiency from the ratings of multiple items (e.g., essays) by multiple raters. However, suitable models quickly become numerically demanding or even unfeasible as separate latent variables are needed to account for local dependencies between the ratings of the same response. Therefore, in the present paper we derive a flexible approach based on Thurstone’s law of categorical judgment. The advantage of this approach is that it can be fit using weighted least squares estimation, which is computationally less demanding than most previous approaches as the number of latent variables increases. In addition, the new approach can be applied using existing latent variable modeling software. We illustrate the model on a real dataset from the Trends in International Mathematics and Science Study (TIMSS) comprising ratings of 10 items by 4 raters for 150 subjects. In addition, we compare the new model to existing models including the facet model, the hierarchical rater model, and the hierarchical rater latent class model.
topic rating data
item response theory
local independence
hierarchical rater model
url https://doi.org/10.1515/edu-2020-0105