Summary: Over the last two decades, authorship studies has absorbed a number of quantitative methods made possible only by computers. The New Oxford Shakespeare Authorship Companion presents several studies that employ such methods, including some based on machine-learning or “deep learning” models.
This paper focuses on the application of three such methods in Jack Elliott and Brett Greatley-Hirsch’s “Arden of Faversham and the Print of Many.” It finds their attribution of Arden to William Shakespeare suspect under all three methods: Delta, Nearest Shrunken Centroid, and Random Forests. The underlying models do not sufficiently justify the attributions, the data provided are insufficiently specific, and the internals of the methods are too opaque to stand up to scrutiny. This article lays out the internal flaws of these methods, with particular focus on Nearest Shrunken Centroid.
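To make the method under scrutiny concrete, the following is a minimal sketch of Nearest Shrunken Centroid classification over word-frequency features, using scikit-learn’s NearestCentroid with its shrink_threshold parameter. It is not the authors’ pipeline: the author labels, word frequencies, and threshold value below are synthetic placeholders chosen purely for illustration.

```python
# Illustrative sketch of Nearest Shrunken Centroid (NSC) attribution.
# All data here are synthetic; real studies use relative frequencies of
# common function words across segments of securely attributed plays.
import numpy as np
from sklearn.neighbors import NearestCentroid

rng = np.random.default_rng(0)

# Hypothetical relative frequencies of six function words, 20 text
# segments per candidate author (rows = segments, columns = words).
shakespeare = rng.normal(loc=[.050, .040, .030, .030, .020, .020],
                         scale=.005, size=(20, 6))
marlowe = rng.normal(loc=[.060, .030, .030, .020, .030, .010],
                     scale=.005, size=(20, 6))
X = np.vstack([shakespeare, marlowe])
y = ["Shakespeare"] * 20 + ["Marlowe"] * 20

# shrink_threshold pulls each author's centroid toward the corpus-wide
# centroid, zeroing out words that do not discriminate between authors;
# this shrinkage step is the part of the method examined in the article.
clf = NearestCentroid(shrink_threshold=0.5)
clf.fit(X, y)

# A "disputed" segment is assigned to whichever shrunken centroid is
# nearest. Note that the classifier must pick some candidate and
# attaches no measure of confidence to the label it returns.
disputed = rng.normal(loc=[.052, .038, .030, .028, .022, .018],
                      scale=.005, size=(1, 6))
print(clf.predict(disputed))
```

As the final comment notes, the classifier always returns one of the candidate authors, which is one reason an attribution can look decisive while resting on weak evidence.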
These methodological flaws arguably arise in part from a lack of rigor, but also from an impoverished treatment of the available data, which focuses exclusively on comparative word frequencies within and across authors. Several potentially fruitful directions for authorship studies are suggested that could increase the robustness and accuracy of quantitative methods, along with warnings about the potential limits of such methods.
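For readers unfamiliar with what “comparative word frequencies” amounts to in practice, the toy computation below shows the shared basis of these methods in the form of Burrows’s Delta: each text is reduced to z-scored frequencies of common words, and candidates are ranked by mean absolute difference. The author names and frequency values are hypothetical, and this is a simplification of Delta as actually deployed, not the chapter’s code.

```python
# Toy illustration of Burrows's Delta on made-up word frequencies.
import numpy as np

# Hypothetical relative frequencies of five function words per author.
corpus = {
    "Shakespeare": np.array([.050, .040, .030, .030, .020]),
    "Marlowe":     np.array([.060, .030, .030, .020, .030]),
    "Jonson":      np.array([.045, .045, .025, .035, .015]),
}
disputed = np.array([.052, .038, .031, .029, .021])

# Standardize each word's frequency across the comparison corpus.
freqs = np.vstack(list(corpus.values()))
mu, sigma = freqs.mean(axis=0), freqs.std(axis=0)

def delta(a, b):
    """Mean absolute difference of corpus-standardized frequencies."""
    return np.mean(np.abs((a - mu) / sigma - (b - mu) / sigma))

for author, profile in corpus.items():
    print(author, round(delta(disputed, profile), 3))
# The smallest Delta is taken as the attribution. Nothing in the score
# itself indicates whether *any* of the candidates is a plausible author,
# which is part of the limitation the article presses on.
```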