Development and Validation of a Vertical Scale for Formative Assessment in Mathematics

The regular formative assessment of students' abilities across multiple school grades requires a reliable and valid vertical scale. A vertical scale is a precondition not only for comparing assessment results and measuring progress over time, but also for identifying the most informative items...

Bibliographic Details
Main Authors:	Stéphanie Berger, Angela J. Verschoor, Theo J. H. M. Eggen, Urs Moser
Format:	Article
Language:	English
Published:	Frontiers Media S.A. 2019-10-01
Series:	Frontiers in Education
Subjects:	vertical scaling; item calibration; item response theory; curriculum; validation
Online Access:	https://www.frontiersin.org/article/10.3389/feduc.2019.00103/full

id	doaj-193f1dbcea354be2a1e83245f496dfff
record_format	Article
spelling	doaj-193f1dbcea354be2a1e83245f496dfff | 2020-11-25T01:42:16Z | eng | Frontiers Media S.A. | Frontiers in Education | 2504-284X | 2019-10-01 | 4 | 10.3389/feduc.2019.00103 | 458216 | Development and Validation of a Vertical Scale for Formative Assessment in Mathematics | Stéphanie Berger (Department of Research Methodology, Measurement, and Data Analysis, University of Twente, Enschede, Netherlands; Institute for Educational Evaluation, University of Zurich, Zurich, Switzerland) | Angela J. Verschoor (Cito, Institute for Educational Measurement, Arnhem, Netherlands) | Theo J. H. M. Eggen (Department of Research Methodology, Measurement, and Data Analysis, University of Twente, Enschede, Netherlands; Cito, Institute for Educational Measurement, Arnhem, Netherlands) | Urs Moser (Institute for Educational Evaluation, University of Zurich, Zurich, Switzerland) | https://www.frontiersin.org/article/10.3389/feduc.2019.00103/full | vertical scaling; item calibration; item response theory; curriculum; validation
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Stéphanie Berger, Angela J. Verschoor, Theo J. H. M. Eggen, Urs Moser
author_sort	Stéphanie Berger
title	Development and Validation of a Vertical Scale for Formative Assessment in Mathematics
publisher	Frontiers Media S.A.
series	Frontiers in Education
issn	2504-284X
publishDate	2019-10-01
description	The regular formative assessment of students' abilities across multiple school grades requires a reliable and valid vertical scale. A vertical scale is a precondition not only for comparing assessment results and measuring progress over time, but also for identifying the most informative items for each individual student within a large item bank independent of the student's grade to increase measurement efficiency. However, the practical implementation of a vertical scale is psychometrically challenging. Several extant studies point to the complex interactions between the practical context in which the scale is used and the scaling decisions that researchers need to make during the development of a vertical scale. As a consequence, clear general recommendations are missing for most scaling decisions. In this study, we described the development of a vertical scale for the formative assessment of third- through ninth-grade students' mathematics abilities based on item response theory methods. We evaluated the content-related validity of this new vertical scale by contrasting the calibration procedure's empirical outcomes (i.e., the item difficulty estimates) with the theoretical, content-related item difficulties reflected by the underlying competence levels of the curriculum, which served as a content framework for developing the scale. Besides analyzing the general match between empirical and content-related item difficulty, we also explored, by means of correlation and multiple regression analyses, whether the match differed for items related to different curriculum cycles (i.e., primary vs. secondary school), domains, or competencies within mathematics. The results showed strong correlations between the empirical and content-related item difficulties, which emphasized the scale's content-related validity. Further analysis showed a higher correlation between empirical and content-related item difficulty at the primary compared with the secondary school level. Across the different curriculum domains and most of the curriculum competencies, we found comparable correlations, implying that the scale is a good indicator of the math ability stated in the curriculum.
topic	vertical scaling; item calibration; item response theory; curriculum; validation
url	https://www.frontiersin.org/article/10.3389/feduc.2019.00103/full
