Summary: | This thesis addresses the relationship between vocabulary measures and IELTS ratings. The research questions focus on the relationship between measures of lexical richness and teacher ratings; the specific question the thesis seeks to answer is which measures of lexical richness best predict the ratings. This question has been central to vocabulary measurement research in recent decades, particularly in relation to IELTS, one of the most popular exams in the world. If a model can predict IELTS scores from vocabulary measures, it could serve as a predictive tool for teachers and researchers worldwide. The research was carried out through two studies, Study 1 and Study 2, and the resulting model was then tested in a third, smaller study. Study 1 was a small pilot study that examined both oral and written data; Study 2 focused on written data only. Measures of both lexical diversity and lexical sophistication were chosen for both studies, which followed similar methodologies, with one extra variable added in the second. For the first study, data were collected from 42 IELTS learners, whereas the second study used an existing corpus. The measures investigated in both studies were Tokens, TTR, D, Guiraud, Types, Guiraud Advanced and P_Lex. The first four are measures of lexical diversity and the other three are measures of lexical sophistication. All seven, however, are measures of breadth of vocabulary. For the second study, a count of formulaic sequences was added as a measure of one aspect of depth of vocabulary, to check whether the results would improve. Formulaic sequences were counted in each essay using Martinez and Schmitt’s (2012) PHRASE List of the 505 most frequent non-transparent multiword expressions in English.
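As a rough illustration of the simpler measures named above, the sketch below computes Tokens, Types, TTR, Guiraud and Guiraud Advanced from a text using their standard definitions (types divided by tokens, types divided by the square root of tokens, and advanced types divided by the square root of tokens). It is not the procedure used in the thesis: the function name `lexical_measures`, the regex tokenizer, and the `advanced_words` set are all assumptions made for illustration, and what counts as an "advanced" word depends on the frequency list chosen. D and P_Lex are omitted because they rely on model fitting rather than a simple ratio.

```python
import math
import re


def lexical_measures(text, advanced_words=frozenset()):
    """Compute simple lexical richness measures for one text.

    `advanced_words` is a hypothetical, caller-supplied set of words
    treated as "advanced"; the thesis does not specify this list here.
    """
    # Naive tokenizer (an assumption): lowercase runs of letters/apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        raise ValueError("text contains no tokens")

    types = set(tokens)                      # distinct word forms
    advanced_types = types & set(advanced_words)
    root_tokens = math.sqrt(len(tokens))

    return {
        "tokens": len(tokens),
        "types": len(types),
        "ttr": len(types) / len(tokens),           # type-token ratio
        "guiraud": len(types) / root_tokens,       # types / sqrt(tokens)
        "guiraud_advanced": len(advanced_types) / root_tokens,
    }


if __name__ == "__main__":
    sample = "The cat sat on the mat"
    print(lexical_measures(sample, advanced_words={"mat"}))
```

Because TTR falls as texts get longer, the square-root correction in Guiraud is one common way to reduce that sensitivity to text length.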
The main findings show that all the measures correlate with the ratings, but Tokens has the highest correlation among the lexical diversity measures and Types the highest among the lexical sophistication measures. TTR, Guiraud and P_Lex together explain 52.8% of the variability in the lexical ratings. Holistic ratings can be predicted by the same two lexical diversity measures (TTR and Guiraud) combined with a different measure of lexical sophistication, Guiraud Advanced; this three-measure model explains 49.2% of the variability in the holistic ratings. The formulaic count did not appear to improve the model’s predictive validity, although further qualitative analysis suggested an explanation for this. In Study 3, the holistic ratings model was tested on a small sample of real IELTS data, and the examiners’ comments were used for a more qualitative analysis. This revealed that the model underestimated the scores, because the range of ratings in the IELTS data was wider than the range in the Study 2 data on which the model was based. This proved to be a major limitation of the study. However, the qualitative analysis supported the argument that vocabulary accounts for a high percentage of variance in ratings and provided insights into other aspects that may influence raters, which could be added to the model in future research. The issues and limitations of the study, together with the current findings, contribute to the field by stimulating further research into a predictive tool that could inform students of their likely rating before they decide to take the IELTS exam, with potential financial benefits for students.
|