Summary: | In classical test theory, difficulty ($p$) and discrimination ($d$) are two item coefficients that are widely used to analyze and validate items in educational testing. However, test items are usually affected by missing data (MD), and little is known about how methods for handling MD affect these two coefficients. The current study compares several simple substitution (imputation) strategies for dichotomous items to better understand their impact on item difficulty and discrimination. We conducted a simulation study, followed by an analysis of a real data set of items from a language test. Based on the root mean square error (RMSE), person mean (PM) substitution is the best overall replacement method for both difficulty $p$ and discrimination $d$. However, the analysis of bias coefficients and the analysis of the real data show that most of the methods investigated behave similarly when computing $p$, while multiple imputation (MI) and complete cases (CC) seem to be the least biased methods for computing $d$.
|