Factors Influencing Rater Consistency on a Mathematics Performance Assessment
Master's thesis === National Pingtung University of Education === Master's Program, Department of Educational Psychology and Counseling === 94 === The main purpose of this study was to investigate factors influencing rater consistency on a mathematics performance assessment. The factors studied were: type of scoring rubric (holistic vs. analytic), number of rating-scale points (4-point...
Main Author: | TSAI, CHENG-PIN 蔡正濱 |
---|---|
Other Authors: | 張麗麗 |
Format: | Others |
Language: | zh-TW |
Published: | 2005 |
Online Access: | http://ndltd.ncl.edu.tw/handle/59145875468362009567 |
id | ndltd-TW-094NPTTC328004
record_format | oai_dc
spelling |
ndltd-TW-094NPTTC3280042015-12-21T04:04:53Z http://ndltd.ncl.edu.tw/handle/59145875468362009567 Factors Influencing Rater Consistency on a Mathematics Performance Assessment 國小數學科實作評量評分者ㄧ致性相關因素探討 TSAI, CHENG-PIN 蔡正濱 張麗麗 2005 學位論文 ; thesis 166 zh-TW |
collection | NDLTD
language | zh-TW
format | Others
sources | NDLTD
description |
Master's thesis === National Pingtung University of Education === Master's Program, Department of Educational Psychology and Counseling === 94 === The main purpose of this study was to investigate factors influencing rater consistency on a mathematics performance assessment. The factors studied were: type of scoring rubric (holistic vs. analytic), number of rating-scale points (4-point vs. 7-point), complexity of the performance tasks (high- vs. low-complexity tasks), and familiarity with the scoring rubrics (exposed vs. not exposed to the rubrics).
A quasi-experimental study was conducted. Seventy-eight sixth-grade students from two classes in one elementary school participated. Prior to the formal test, all students completed three practice sessions on mathematics performance tests; one of the two groups also received a thorough explanation of the scoring rubrics that would be used to assess their performance on the tests.
Three constructed-response performance tasks were developed and then divided into high- and low-complexity tasks according to the complexity of their language use and of the logic required for problem solving. In addition, the researcher developed two types of scoring rubrics (holistic and analytic), each with both a 4-point and a 7-point rating scale. Generalizability (G) studies, Spearman rank-order correlations, and percentages of agreement among raters were used to examine rater reliability.
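For illustration, the two simpler rater-consistency indices mentioned above can be computed directly. This is a minimal sketch with made-up ratings, not data from the study; the rater vectors and function names are hypothetical.

```python
# Hypothetical scores from two raters for ten student responses (illustrative only).
rater_a = [3, 4, 2, 4, 1, 3, 2, 4, 3, 1]
rater_b = [3, 4, 2, 3, 1, 3, 2, 4, 4, 1]

def average_ranks(xs):
    # Rank the values 1..n, giving tied values the mean of their ranks.
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    # Spearman rho is the Pearson correlation of the two rank vectors.
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

def exact_agreement(x, y):
    # Proportion of responses to which both raters assigned the identical score.
    return sum(a == b for a, b in zip(x, y)) / len(x)

print(round(spearman_rho(rater_a, rater_b), 3))
print(exact_agreement(rater_a, rater_b))  # 8 of 10 identical scores -> 0.8
```

In practice, percent agreement is often reported both as exact agreement and as adjacent agreement (scores within one point); the exact-agreement form shown here is the stricter of the two.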
The major findings of this study were as follows:
1. In general, the largest share of score variance on the mathematics performance assessment was due to the person-task interaction (p×t), followed by persons (p) and the person-rater-task interaction confounded with error (p×r×t,e).
2. The p×t×r G-studies showed that rater-related variation (r, p×r, and t×r) accounted for only a small proportion of the total variance, regardless of rubric type, rating scale (4-point or 7-point), or task complexity. However, the p×r G-studies, Spearman rank-order correlations, and percentages of rater agreement showed that rater consistency was higher for analytic than for holistic rubrics, for low- than for high-complexity tasks, and for the 7-point than for the 4-point rating scale. Finally, students' familiarity with the scoring rubrics did not affect rater consistency when low-complexity tasks were scored; however, raters were more consistent when scoring responses to the high-complexity tasks from students who were familiar with the rubrics.
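The variance components in the findings above come from a fully crossed person × task × rater (p×t×r) G-study. As a minimal sketch, and not the study's actual procedure or data, the seven components of such a design can be estimated by computing the three-way ANOVA mean squares and solving the expected-mean-square equations; the `variance_components` function and the score array below are illustrative assumptions.

```python
import numpy as np

def variance_components(x):
    """Estimate variance components for a fully crossed p x t x r design
    (one observation per cell) by solving the expected-mean-square equations."""
    n_p, n_t, n_r = x.shape  # persons, tasks, raters
    grand = x.mean()
    # Marginal and two-way cell means.
    mp = x.mean(axis=(1, 2)); mt = x.mean(axis=(0, 2)); mr = x.mean(axis=(0, 1))
    mpt = x.mean(axis=2); mpr = x.mean(axis=1); mtr = x.mean(axis=0)
    # Sums of squares for main effects, two-way interactions, and the residual.
    ss_p = n_t * n_r * ((mp - grand) ** 2).sum()
    ss_t = n_p * n_r * ((mt - grand) ** 2).sum()
    ss_r = n_p * n_t * ((mr - grand) ** 2).sum()
    ss_pt = n_r * ((mpt - mp[:, None] - mt[None, :] + grand) ** 2).sum()
    ss_pr = n_t * ((mpr - mp[:, None] - mr[None, :] + grand) ** 2).sum()
    ss_tr = n_p * ((mtr - mt[:, None] - mr[None, :] + grand) ** 2).sum()
    ss_ptr = ((x - mpt[:, :, None] - mpr[:, None, :] - mtr[None, :, :]
               + mp[:, None, None] + mt[None, :, None] + mr[None, None, :]
               - grand) ** 2).sum()
    ms = {
        "p": ss_p / (n_p - 1), "t": ss_t / (n_t - 1), "r": ss_r / (n_r - 1),
        "pt": ss_pt / ((n_p - 1) * (n_t - 1)),
        "pr": ss_pr / ((n_p - 1) * (n_r - 1)),
        "tr": ss_tr / ((n_t - 1) * (n_r - 1)),
        "ptr,e": ss_ptr / ((n_p - 1) * (n_t - 1) * (n_r - 1)),
    }
    # Back-solve the expected-mean-square equations for the components.
    return {
        "ptr,e": ms["ptr,e"],
        "pt": (ms["pt"] - ms["ptr,e"]) / n_r,
        "pr": (ms["pr"] - ms["ptr,e"]) / n_t,
        "tr": (ms["tr"] - ms["ptr,e"]) / n_p,
        "p": (ms["p"] - ms["pt"] - ms["pr"] + ms["ptr,e"]) / (n_t * n_r),
        "t": (ms["t"] - ms["pt"] - ms["tr"] + ms["ptr,e"]) / (n_p * n_r),
        "r": (ms["r"] - ms["pr"] - ms["tr"] + ms["ptr,e"]) / (n_p * n_t),
    }

# Hypothetical 4 persons x 3 tasks x 2 raters; the two raters agree perfectly
# here, so every rater-related component comes out (near) zero.
base = np.array([[3., 1., 2.], [4., 2., 3.], [2., 2., 1.], [5., 3., 4.]])
x = np.stack([base, base], axis=2)  # shape (4, 3, 2)
vc = variance_components(x)
print({k: round(v, 3) for k, v in vc.items()})
```

A large p component relative to p×t is desirable (persons differ consistently across tasks); the study's finding that p×t dominated reflects the well-known task-sampling variability of performance assessments.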
Based on these findings, suggestions regarding scoring rubrics and issues for future research were provided.
|
author2 |
張麗麗 |
author_facet |
張麗麗 TSAI, CHENG-PIN 蔡正濱 |
author |
TSAI, CHENG-PIN 蔡正濱 |
spellingShingle |
TSAI, CHENG-PIN 蔡正濱 Factors Influencing Rater Consistency on a Mathematics Performance Assessment |
author_sort |
TSAI, CHENG-PIN |
title |
Factors Influencing Rater Consistency on a Mathematics Performance Assessment |
title_short |
Factors Influencing Rater Consistency on a Mathematics Performance Assessment |
title_full |
Factors Influencing Rater Consistency on a Mathematics Performance Assessment |
title_fullStr |
Factors Influencing Rater Consistency on a Mathematics Performance Assessment |
title_full_unstemmed |
Factors Influencing Rater Consistency on a Mathematics Performance Assessment |
title_sort |
factors influencing rater consistency on a mathematics performance assessment |
publishDate |
2005 |
url |
http://ndltd.ncl.edu.tw/handle/59145875468362009567 |
work_keys_str_mv |
AT tsaichengpin factorsinfluencingraterconsistencyonamathematicsperformanceassessment AT càizhèngbīn factorsinfluencingraterconsistencyonamathematicsperformanceassessment AT tsaichengpin guóxiǎoshùxuékēshízuòpíngliàngpíngfēnzhěyi1zhìxìngxiāngguānyīnsùtàntǎo AT càizhèngbīn guóxiǎoshùxuékēshízuòpíngliàngpíngfēnzhěyi1zhìxìngxiāngguānyīnsùtàntǎo |
_version_ |
1718155365776883712 |