The Voices of Old Bailey Defendants 1750-1900

These datasets were created to explore the Voices of Authority research theme of the Digital Panopticon project. They cover a period in which the English criminal trial (for felonies) changed profoundly: in the mid-18th century few lawyers were involved, and the trial was conceived as a direct encounter between the victim, acting as prosecutor, and the defendant, who was expected to conduct their own defence. By 1900, lawyers were in charge, and trial procedures were increasingly bureaucratic. I want to explore what this meant for defendants.

Old Bailey Voices 1780-1880 (OBV)

OBV is derived from two sources: the Old Bailey Corpus (version 2) (OBC) and the Old Bailey Online Proceedings (OBO). It contains data from all single-defendant trials (21,023 defendants) in 227 sessions of the Old Bailey Proceedings between 1780 and 1880 to which the Old Bailey Corpus project has added linguistic markup.

For this project it was essential to associate defendants correctly with their spoken words (not a concern for OBC), because we intend to trace their histories and long-term outcomes using the Digital Panopticon’s record linkage. Doing this reliably in trials with multiple defendants proved difficult (in fact, it’s quite often impossible!), which led to the decision to restrict this dataset to single-defendant trials.

The OBV data has three components:

  • a new version of the tagged speech data from OBC, with some additional tagging and OBO defendant IDs
  • summary data for each defendant (obv_defendants_trials.tsv), containing biographical and trial information
  • data extracted from OBC on the scribes, publishers, printers and editors for each session
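
As a minimal sketch of how a tab-separated file such as obv_defendants_trials.tsv might be read, assuming it has a header row: the column names are not documented here, so the sample below uses invented headers and values (defendant_id, gender, verdict) purely for illustration, and the helper name read_tsv_rows is hypothetical.

```python
import csv
import io


def read_tsv_rows(handle):
    """Return each data row of a tab-separated file as a dict keyed by header."""
    return list(csv.DictReader(handle, delimiter="\t"))


# Hypothetical two-row sample standing in for the real file; the column
# names and values below are invented for illustration only.
sample = io.StringIO(
    "defendant_id\tgender\tverdict\n"
    "example-1\tfemale\tguilty\n"
    "example-2\tmale\tnotGuilty\n"
)

rows = read_tsv_rows(sample)
print(len(rows))        # number of data rows read
print(sorted(rows[0]))  # header names found in the file
```

To read the downloaded file instead of the sample, the same function can be passed an open file handle, for example one returned by open("obv_defendants_trials.tsv", newline="", encoding="utf-8"). The UTF-8 encoding is an assumption and may need adjusting.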

Get the OBV data

Defence statements 1751-1900

This dataset has been created from the Old Bailey Online XML data. It consists of all prisoner defence statements that could be identified in trials in the Old Bailey Proceedings between 1751 and 1900.

Although OBO data is the direct source material, the dataset would not have been possible without the Old Bailey Corpus: the OBC linguistic tagging has made it possible to see speech data within the Proceedings in new ways. In particular, OBC made it much clearer that (a) prisoner defences (and, indeed, prisoners’ statements more generally) are quite consistently labelled in the Proceedings text, and (b) defences tend to appear in certain places and forms that could be identified and extracted from the OBO data.

Structured data from OBO has then been added to the defence texts: all available defendant personal information (including gender and age), and offence, verdict and sentence categories. This makes it possible to explore the effectiveness of particular kinds of defence narrative, or variations by gender or type of offence.
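
One simple way to begin exploring such variations is to cross-tabulate two of the structured fields. The sketch below counts verdicts by gender. It assumes each defence has already been loaded as a dict, and the field names and category labels (gender, verdict, guilty, notGuilty) are invented for illustration and may not match those in the dataset.

```python
from collections import Counter


def crosstab(records, row_key, col_key):
    """Count records for each (row_key value, col_key value) pair."""
    return Counter((r[row_key], r[col_key]) for r in records)


# Invented records for illustration; real field names and category
# labels in the defences data may differ.
records = [
    {"gender": "female", "verdict": "guilty"},
    {"gender": "female", "verdict": "notGuilty"},
    {"gender": "male", "verdict": "guilty"},
    {"gender": "male", "verdict": "guilty"},
]

counts = crosstab(records, "gender", "verdict")
for (gender, verdict), n in sorted(counts.items()):
    print(gender, verdict, n)
```

The same function works for any pair of categorical fields, such as offence category against sentence category.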

Get the defences data

If you want both datasets, you may find it more convenient to download them here

The data (unless otherwise specified) is released under a Creative Commons Attribution-ShareAlike 4.0 International Licence.