Summary: | The prospect of “big data” at once evokes optimistic views of an information-rich future and concerns about surveillance that adversely impacts our personal and private lives. This overview article explores the implications of big data in education, focusing by way of example on data generated by student writing. We have chosen writing because it presents particular complexities, highlighting the range of processes for collecting and interpreting evidence of learning in the era of computer-mediated instruction and assessment as well as the challenges. Writing is significant not only because it is central to the core subject area of literacy; it is also an ideal medium for the representation of deep disciplinary knowledge across a number of subject areas. After defining what big data entails in education, we map emerging sources of evidence of learning that separately and together have the potential to generate unprecedented amounts of data: machine assessments, structured data embedded in learning, and unstructured data collected incidental to learning activity. Our case is that these emerging sources of evidence of learning have significant implications for the traditional relationships between assessment and instruction. Moreover, for educational researchers, these data are in some senses quite different from traditional evidentiary sources, and this raises a number of methodological questions. The final part of the article discusses implications for practice in an emerging field of education data science, including publication of data, data standards, and research ethics.
|