Summary: | Written skills are an essential criterion for evaluating a student’s creativity, knowledge, and intellect. Consequently, academic writing is a common component of university and college admission applications, standardized tests, and classroom assessments. Scoring essays, however, is a daunting task for teachers, and Automated Essay Scoring can be a helpful tool to support their decisions. Many successful models based on supervised or unsupervised machine learning algorithms have been developed in the field of Automated Essay Scoring. This thesis presents a comparative study of several neural network models trained with supervised machine learning algorithms and different combinations of linguistic features. It also shows that the same linguistic features are applicable to more than one language. The models studied in this experiment are TextCNN, TextRNN_LSTM, TextRNN_GRU, and TextRCNN, trained on essays from the Automated Student Assessment Prize (ASAP) dataset from Kaggle competitions. Each essay is represented by linguistic features measuring linguistic complexity. These features are divided into four groups: count-based, morphological, syntactic, and lexical features; the four groups yield a total of 14 combinations. The models are evaluated using three measures: accuracy, F1 score, and Quadratic Weighted Kappa. The experimental results show that models trained only with count-based features outperform models trained with the other feature combinations. In addition, TextRNN_LSTM performs best, with an accuracy of 54.79%, an F1 score of 0.55, and a Quadratic Weighted Kappa of 0.59, beating the statistically-based baseline models.
|
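The summary above cites Quadratic Weighted Kappa (QWK) as one of the three evaluation measures. As a minimal sketch only, and not the implementation used in the thesis, the following Python function shows how QWK is commonly computed for integer essay scores; the function name, the assumed score range, and the example values are purely illustrative.

    import numpy as np

    def quadratic_weighted_kappa(y_true, y_pred, num_labels):
        """Quadratic Weighted Kappa for integer scores in {0, ..., num_labels - 1}."""
        y_true = np.asarray(y_true, dtype=int)
        y_pred = np.asarray(y_pred, dtype=int)

        # Observed agreement (confusion) matrix between gold and predicted scores.
        observed = np.zeros((num_labels, num_labels), dtype=float)
        for t, p in zip(y_true, y_pred):
            observed[t, p] += 1

        # Expected matrix under chance agreement: outer product of the marginal
        # histograms, scaled to the same total count as the observed matrix.
        expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / observed.sum()

        # Quadratic disagreement weights: larger penalty for larger score differences.
        i, j = np.indices((num_labels, num_labels))
        weights = (i - j) ** 2 / (num_labels - 1) ** 2

        return 1.0 - (weights * observed).sum() / (weights * expected).sum()

    # Hypothetical usage on a 0-3 score scale.
    print(quadratic_weighted_kappa([0, 1, 2, 3, 2], [0, 2, 2, 3, 1], num_labels=4))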