Impact of differential item functioning on statistical conclusions

Differential item functioning (DIF), sometimes called item bias, has been widely studied in educational and psychological measurement; however, to date, research has focused on the definitions of, and the methods for, detecting DIF. It is well accepted that the presence of DIF may degrade the validity of a test. Relatively little is known, however, about the impact of DIF on later statistical decisions when the observed test scores are used in data analyses and the corresponding statistical hypothesis tests. This dissertation investigated the impact of DIF on later statistical decisions based on the observed total test (or scale) score; in particular, very little is known about the impact of DIF on the Type I error rate and effect size of, for instance, the independent-samples t-test on observed total test scores. Five studies were conducted: studies one to three investigated the impact of unidirectional DIF (i.e., DIF amplification) on the Type I error rate and effect size of the independent-samples t-test, and studies four and five investigated the effects of DIF cancellation on the same quantities. The Type I error rate and effect size were defined in terms of latent population means rather than observed sample means. The results showed that the amplification and cancellation effects among uniform DIF items did transfer to the test level. Both the Type I error rate and the effect size were inflated. The degree of inflation depended on the number of DIF items, the magnitude of DIF, the sample sizes, and the interactions among these factors. These findings highlight the importance of screening for DIF before conducting any further statistical analysis, and they offer practicing researchers advice about when, and by how much, the presence of DIF will affect statistical conclusions based on observed total test scores.
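
The abstract outlines a simulation-style design: generate item responses with uniform DIF against one group, sum them to observed total scores, and check how often an independent-samples t-test rejects equal group means even though the latent means are identical. Below is a minimal illustrative sketch of that idea, assuming a 2PL IRT model with invented item parameters, sample sizes, and DIF magnitude; none of these values, names, or functions come from the dissertation itself.

```python
# Hypothetical sketch (not the dissertation's code): simulate uniform DIF under a
# 2PL IRT model, form observed total scores, and estimate the empirical Type I
# error rate of the independent-samples t-test when the latent means are equal.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)

N_PER_GROUP = 500   # examinees per group (assumed value)
N_ITEMS = 30        # test length (assumed value)
N_DIF_ITEMS = 6     # items showing uniform DIF against the focal group
DIF_SHIFT = 0.5     # difficulty shift, all in one direction -> amplification
N_REPS = 1000       # simulation replications
ALPHA = 0.05

a = rng.uniform(0.8, 2.0, N_ITEMS)   # discrimination parameters
b = rng.normal(0.0, 1.0, N_ITEMS)    # difficulty parameters

def simulate_totals(theta, b_group):
    """Generate 2PL item responses and return observed total scores."""
    # P(correct) = 1 / (1 + exp(-a * (theta - b)))
    p = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b_group[None, :])))
    responses = rng.random(p.shape) < p
    return responses.sum(axis=1)

rejections = 0
for _ in range(N_REPS):
    # Equal latent means, so any significant t-test is a Type I error.
    theta_ref = rng.normal(0.0, 1.0, N_PER_GROUP)
    theta_foc = rng.normal(0.0, 1.0, N_PER_GROUP)

    b_foc = b.copy()
    b_foc[:N_DIF_ITEMS] += DIF_SHIFT   # uniform DIF: items harder for the focal group

    total_ref = simulate_totals(theta_ref, b)
    total_foc = simulate_totals(theta_foc, b_foc)

    _, p_value = ttest_ind(total_ref, total_foc)
    rejections += p_value < ALPHA

print(f"Empirical Type I error rate: {rejections / N_REPS:.3f} (nominal {ALPHA})")
```

In this toy setup, raising N_DIF_ITEMS or DIF_SHIFT mimics the amplification conditions described above, while giving some items shifts of opposite sign mimics the cancellation conditions.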


Bibliographic Details
Main Author: Li, Zhen
Language: English
Published: University of British Columbia 2009
Online Access: http://hdl.handle.net/2429/14680
id ndltd-LACETR-oai-collectionscanada.gc.ca-BVAU.2429-14680
record_format oai_dc
spelling ndltd-LACETR-oai-collectionscanada.gc.ca-BVAU.2429-14680 2014-03-26T03:36:40Z Impact of differential item functioning on statistical conclusions Li, Zhen Differential item functioning (DIF), sometimes called item bias, has been widely studied in educational and psychological measurement; however, to date, research has focused on the definitions of, and the methods for, detecting DIF. It is well accepted that the presence of DIF may degrade the validity of a test. Relatively little is known, however, about the impact of DIF on later statistical decisions when the observed test scores are used in data analyses and the corresponding statistical hypothesis tests. This dissertation investigated the impact of DIF on later statistical decisions based on the observed total test (or scale) score; in particular, very little is known about the impact of DIF on the Type I error rate and effect size of, for instance, the independent-samples t-test on observed total test scores. Five studies were conducted: studies one to three investigated the impact of unidirectional DIF (i.e., DIF amplification) on the Type I error rate and effect size of the independent-samples t-test, and studies four and five investigated the effects of DIF cancellation on the same quantities. The Type I error rate and effect size were defined in terms of latent population means rather than observed sample means. The results showed that the amplification and cancellation effects among uniform DIF items did transfer to the test level. Both the Type I error rate and the effect size were inflated. The degree of inflation depended on the number of DIF items, the magnitude of DIF, the sample sizes, and the interactions among these factors. These findings highlight the importance of screening for DIF before conducting any further statistical analysis, and they offer practicing researchers advice about when, and by how much, the presence of DIF will affect statistical conclusions based on observed total test scores. 2009-11-05T21:04:48Z 2009-11-05T21:04:48Z 2009 2009-11-05T21:04:48Z 2010-05 Electronic Thesis or Dissertation http://hdl.handle.net/2429/14680 eng University of British Columbia
collection NDLTD
language English
sources NDLTD
description Differential item functioning (DIF), sometimes called item bias, has been widely studied in educational and psychological measurement; however, to date, research has focused on the definitions of, and the methods for, detecting DIF. It is well accepted that the presence of DIF may degrade the validity of a test. Relatively little is known, however, about the impact of DIF on later statistical decisions when the observed test scores are used in data analyses and the corresponding statistical hypothesis tests. This dissertation investigated the impact of DIF on later statistical decisions based on the observed total test (or scale) score; in particular, very little is known about the impact of DIF on the Type I error rate and effect size of, for instance, the independent-samples t-test on observed total test scores. Five studies were conducted: studies one to three investigated the impact of unidirectional DIF (i.e., DIF amplification) on the Type I error rate and effect size of the independent-samples t-test, and studies four and five investigated the effects of DIF cancellation on the same quantities. The Type I error rate and effect size were defined in terms of latent population means rather than observed sample means. The results showed that the amplification and cancellation effects among uniform DIF items did transfer to the test level. Both the Type I error rate and the effect size were inflated. The degree of inflation depended on the number of DIF items, the magnitude of DIF, the sample sizes, and the interactions among these factors. These findings highlight the importance of screening for DIF before conducting any further statistical analysis, and they offer practicing researchers advice about when, and by how much, the presence of DIF will affect statistical conclusions based on observed total test scores.
author Li, Zhen
spellingShingle Li, Zhen
Impact of differential item functioning on statistical conclusions
author_facet Li, Zhen
author_sort Li, Zhen
title Impact of differential item functioning on statistical conclusions
title_short Impact of differential item functioning on statistical conclusions
title_full Impact of differential item functioning on statistical conclusions
title_fullStr Impact of differential item functioning on statistical conclusions
title_full_unstemmed Impact of differential item functioning on statistical conclusions
title_sort impact of differential item functioning on statistical conclusions
publisher University of British Columbia
publishDate 2009
url http://hdl.handle.net/2429/14680
work_keys_str_mv AT lizhen impactofdifferentialitemfunctioningonstatisticalconclusions
_version_ 1716655192845320192