Performance of the Ebel standard-setting method for the spring 2019 Royal College of Physicians and Surgeons of Canada internal medicine certification examination consisting of multiple-choice questions

Purpose This study aimed to assess the performance of the Ebel standard-setting method for the spring 2019 Royal College of Physicians and Surgeons of Canada internal medicine certification examination consisting of multiple-choice questions. Specifically, the following parameters were evaluated: in...

Full description

Bibliographic Details
Main Authors: Jimmy Bourque, Haley Skinner, Jonathan Dupré, Maria Bacchus, Martha Ainslie, Irene W. Y. Ma, Gary Cole
Format: Article
Language:English
Published: Korea Health Insurance Licensing Examination Institute 2020-04-01
Series:Journal of Educational Evaluation for Health Professions
Subjects:
Online Access:http://www.jeehp.org/upload/jeehp-17-12.pdf
Description
Summary:Purpose This study aimed to assess the performance of the Ebel standard-setting method for the spring 2019 Royal College of Physicians and Surgeons of Canada internal medicine certification examination consisting of multiple-choice questions. Specifically, the following parameters were evaluated: inter-rater agreement, the correlations between Ebel scores and item facility indices, the impact of raters’ knowledge of correct answers on the Ebel score, and the effects of raters’ specialty on inter-rater agreement and Ebel scores. Methods Data were drawn from a Royal College of Physicians and Surgeons of Canada certification exam. The Ebel method was applied to 203 multiple-choice questions by 49 raters. Facility indices came from 194 candidates. We computed the Fleiss kappa and the Pearson correlations between Ebel scores and item facility indices. We investigated differences in the Ebel score according to whether correct answers were provided or not and differences between internists and other specialists using the t-test. Results The Fleiss kappa was below 0.15 for both facility and relevance. The correlation between Ebel scores and facility indices was low when correct answers were provided and negligible when they were not. The Ebel score was the same whether the correct answers were provided or not. Inter-rater agreement and Ebel scores were not significantly different between internists and other specialists. Conclusion Inter-rater agreement and correlations between item Ebel scores and facility indices were consistently low; furthermore, raters’ knowledge of the correct answers and raters’ specialty had no effect on Ebel scores in the present setting.
ISSN:1975-5937