Supervised Classification: The Naive Bayesian Returns to the Old Bailey

Bibliographic Details
Main Author: Vilja Hulden
Format: Article
Language: English
Published: Editorial Board of the Programming Historian 2014-12-01
Series: The Programming Historian
Online Access: http://programminghistorian.org/lessons/naive-bayesian
Description
Summary: A few years back, William Turkel wrote a series of blog posts called A Naive Bayesian in the Old Bailey, which showed how machine learning can be used to extract interesting documents from a digital archive. This tutorial updates that series, using roughly the same data but a slightly different version of the machine learner. It aims to show why machine learning methods are of interest to historians and presents a step-by-step implementation of a supervised machine learner. The learner is then applied to the Old Bailey digital archive, which contains several centuries’ worth of transcripts of trials held at the Old Bailey in London. The implementation uses Python.
ISSN: 2397-2068