Assessing Scientific Practices Using Machine Learning Methods: Development of Automated Computer Scoring Models for Written Evolutionary Explanations
Main Author: | Ha, Minsu |
---|---|
Language: | English |
Published: | The Ohio State University / OhioLINK, 2013 |
Subjects: | Educational Evaluation; Educational Technology; Science Education; automated scoring; machine learning; assessment; explanation; evolution; natural selection |
Online Access: | http://rave.ohiolink.edu/etdc/view?acc_num=osu1367505135 |
id | ndltd-OhioLink-oai-etd.ohiolink.edu-osu1367505135 |
---|---|
record_format | oai_dc |
collection | NDLTD |
language | English |
sources | NDLTD |
topic | Educational Evaluation; Educational Technology; Science Education; automated scoring; machine learning; assessment; explanation; evolution; natural selection |
author | Ha, Minsu |
title | Assessing Scientific Practices Using Machine Learning Methods: Development of Automated Computer Scoring Models for Written Evolutionary Explanations |
publisher | The Ohio State University / OhioLINK |
publishDate | 2013 |
url | http://rave.ohiolink.edu/etdc/view?acc_num=osu1367505135 |
description |
Although multiple-choice assessment formats are commonly used throughout the educational hierarchy, they can measure only a small subset of important disciplinary competencies and practices. Consequently, science educators need open-response assessments that can validly measure more advanced skills and performances (e.g., producing written scientific explanations). Open-response assessments, however, are impractical in many educational contexts because of the high cost of scoring, the delayed feedback to test-takers, and the lack of scoring consistency among human graders. In this study, the efficacy of automated computer scoring (ACS) of written explanations is examined relative to human scoring. The study aims to build ACS models using machine-learning methods to detect a suite of scientific and naive ideas in written scientific explanations, and to explore approaches for optimizing these models. Nine machine-learning models were developed and evaluated to detect six scientific concepts and three naive ideas about natural selection. In addition, the study examines the effects of three machine-learning parameters (n-gram selection, stop words, and removal of misclassified data) on the performance of the ACS models. To test the efficacy of the ACS models, a corpus of 10,270 written evolutionary explanations, produced in response to a variety of items differing in surface features, was gathered. The corpus was scored by expert human raters and by the ACS models, and four correspondence measures were calculated: kappa, raw agreement, precision, and recall. Methodologically, the ACS models were built using the SMO (Sequential Minimal Optimization) algorithm in the LightSIDE software. Repeated-measures ANOVAs, Pearson correlations, and logarithmic regressions were used to examine the effects of the three machine-learning parameters on the human-computer correspondence measures and the effect of sample size on model performance.

The results indicated that human-computer correspondence was robust, with kappa values of 0.880 for variation, 0.848 for heritability, 0.962 for competition, 0.957 for limited resources, 0.843 for differential survival/reproduction, 0.968 for non-adaptive ideas, 0.845 for needs/goals, 0.776 for use/disuse, and 0.732 for adapt/acclimation. Pearson correlations between human and computer scoring were also robust (0.96 for scientific ideas and 0.90 for naive ideas). Analyses of the differential effects of the three machine-learning parameters indicated that bigrams were helpful in disambiguating ambiguous key words (e.g., "longer"), that using stop words improved particular ACS models (e.g., needs/goals), and that removing misclassified data helped build more accurate ACS models for complex concepts. The best-performing ACS models used concept-specific machine-learning parameters. Moreover, as the number of human-scored responses used to build the ACS models increased, kappa, agreement, precision, and recall increased logarithmically.

Overall, this study found that the efficacy of ACS model prediction was enhanced by targeted model settings for each concept and by human intuition about the structure of scientific language. Collectively, the findings show that concept-specific ACS models can produce scores comparable to those of trained human raters and offer great promise in science education for evaluating the composition and quality of written evolutionary explanations. |
date | 2013-08-27 |
format | text |
access | unrestricted |
rights | This thesis or dissertation is protected by copyright: all rights reserved. It may not be copied or redistributed beyond the terms of applicable copyright laws. |
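The abstract describes binary concept detectors built with the SMO algorithm in LightSIDE, with n-gram selection and stop-word handling as tunable parameters. As a rough illustration only, the sketch below shows a comparable setup in scikit-learn rather than the software used in the study; the function name, default parameter values, and usage variables are hypothetical.

```python
# Illustrative sketch only: a concept detector analogous to the study's ACS models,
# built with scikit-learn instead of the LightSIDE/Weka SMO implementation.
# Function name, parameter defaults, and the usage variables below are hypothetical.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

def build_acs_model(ngram_range=(1, 2), stop_words=None):
    """Binary detector: does an explanation express one target concept?

    ngram_range=(1, 2) includes bigrams, which the study found helpful for
    disambiguating words such as "longer"; stop_words toggles use of a
    stop-word list, a second parameter the study tuned per concept.
    """
    return Pipeline([
        ("features", CountVectorizer(ngram_range=ngram_range, stop_words=stop_words)),
        ("svm", LinearSVC()),  # a linear SVM; SMO is one algorithm for training such models
    ])

# Hypothetical usage with human-scored training data:
# model = build_acs_model()
# model.fit(train_texts, train_labels)      # texts plus 0/1 concept labels from human raters
# predictions = model.predict(test_texts)
```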
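The four human-computer correspondence measures named in the abstract (kappa, raw agreement, precision, and recall) can be computed as in the following minimal sketch, which uses scikit-learn; the example labels are made up and are not the study's data.

```python
# Sketch of the four correspondence measures reported in the abstract, computed
# with scikit-learn; the human/computer labels below are invented for illustration.
from sklearn.metrics import accuracy_score, cohen_kappa_score, precision_score, recall_score

def correspondence(human, computer):
    return {
        "kappa": cohen_kappa_score(human, computer),
        "raw_agreement": accuracy_score(human, computer),  # fraction of identical scores
        "precision": precision_score(human, computer),
        "recall": recall_score(human, computer),
    }

# Hypothetical example: the model matches the human rater on 9 of 10 responses.
human_scores = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
computer_scores = [1, 0, 1, 0, 0, 0, 1, 0, 1, 0]
print(correspondence(human_scores, computer_scores))
```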
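The abstract also reports that kappa, agreement, precision, and recall increased logarithmically as the number of human-scored responses grew. One simple way to express such a relationship is a least-squares fit of kappa against ln(n); the sketch below uses NumPy, and the sample sizes and kappa values are invented for illustration, not taken from the study.

```python
# Sketch of the logarithmic relationship between training-set size and kappa:
# fit kappa = a + b*ln(n). The n and kappa values here are hypothetical.
import numpy as np

n = np.array([100, 250, 500, 1000, 2000, 4000])         # hypothetical numbers of human-scored responses
kappa = np.array([0.55, 0.66, 0.72, 0.78, 0.83, 0.86])  # hypothetical kappa values

b, a = np.polyfit(np.log(n), kappa, 1)  # least-squares line: kappa = a + b*ln(n)
print(f"kappa ~ {a:.3f} + {b:.3f} * ln(n)")
print("predicted kappa at n = 8000:", round(a + b * np.log(8000), 3))
```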