Generalizability Analysis of Constructed-Response and Hands-On Performance Tasks in Home Economics

Master's thesis === National University of Tainan === Graduate Institute of Measurement and Statistics (Master's Program) === 99 === Owing to the new high school admission policy in Taiwan, students and parents value not only traditional objective tests but also alternative assessments in all disciplines, including home economics. Considering the nature of home economics...


Bibliographic Details
Main Authors: Chiao-ying Wu, 吳巧吟
Other Authors: Huey-ing Tzou
Format: Others
Language: zh-TW
Published: 2011
Online Access: http://ndltd.ncl.edu.tw/handle/55859534686113428622
Description
Summary: Master's thesis === National University of Tainan === Graduate Institute of Measurement and Statistics (Master's Program) === 99 === Owing to the new high school admission policy in Taiwan, students and parents value not only traditional objective tests but also alternative assessments in all disciplines, including home economics. Given the nature of home economics, performance tasks are widely used in classroom assessment, so the fairness and consistency of teachers' scoring have become a major concern for academic assessment in junior high schools. To examine the generalizability of scores derived from performance tasks, the study designed two constructed-response tasks and two hands-on tasks in home economics for 7th graders. Both univariate and multivariate generalizability theory were used to examine score generalizability with respect to the task and rater facets under three rating conditions: all raters scoring each task without rubrics, with rubrics, and after receiving rater training. The study collected 384 task samples from 96 students in four classes taught by the four raters. Under each rating condition, the four raters scored all four tasks of eight students from each class. The results showed that rater consistency was higher when tasks were scored after rater training than under the other two conditions. The task variance components were much larger in the without-rubrics and with-rubrics conditions than in the rater-training condition. Furthermore, the D-study showed that the most efficient scoring design was one rater with two tasks in all three conditions. In addition, t-tests revealed no difference between the scores given by students' own teacher and those given by other teachers. Given these results, the study suggested that a feasible and ideal performance assessment should have three features: (1) at least two tasks, (2) scoring by the instructing teacher, and (3) scoring after rater training.
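For readers unfamiliar with D-studies, the following is a minimal sketch of how a relative generalizability coefficient could be projected for different numbers of tasks and raters. It assumes a fully crossed person x task x rater design with random facets, which the record does not confirm for this thesis. The function name `g_coefficient` and all variance component values are hypothetical illustrations, not results from the study.

```python
def g_coefficient(var_p, var_pt, var_pr, var_ptr_e, n_tasks, n_raters):
    """Relative generalizability coefficient for a crossed p x t x r design.

    var_p      person (universe-score) variance
    var_pt     person-by-task interaction variance
    var_pr     person-by-rater interaction variance
    var_ptr_e  residual (person-by-task-by-rater interaction plus error)
    """
    # Relative error variance: each interaction term is divided by the
    # number of conditions sampled from the facets it involves.
    rel_error = (
        var_pt / n_tasks
        + var_pr / n_raters
        + var_ptr_e / (n_tasks * n_raters)
    )
    return var_p / (var_p + rel_error)


# Hypothetical variance components, NOT taken from the thesis.
components = dict(var_p=0.50, var_pt=0.20, var_pr=0.05, var_ptr_e=0.25)

# Project the coefficient for several candidate scoring designs.
for n_raters in (1, 2):
    for n_tasks in (1, 2, 4):
        coef = g_coefficient(**components, n_tasks=n_tasks, n_raters=n_raters)
        print(f"raters={n_raters} tasks={n_tasks} coefficient={coef:.3f}")
```

Comparing the projected coefficients across designs is how a D-study weighs precision against scoring cost, for example when judging whether one rater with two tasks is sufficient.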