Welcome to The Experimental Humanities Collaborative Network

Latent Space: From Datasets to Digital Heritage

with Aarati Akkapeddi, Ananda Rutherford, and Anna Ridler
June 2, 2023 at 11:00 AM - 4:00 PM

Keynes Library

School of Arts
Birkbeck, University of London
43 Gordon Square

Issues of incomplete data, acquisition transparency, and problematic taxonomies arguably span both Archival Ethics and AI Ethics. The bias of major Machine Learning datasets like ImageNet has been widely criticised, leading to debates over collection methodologies and consent. At the same time, in the past decade, many museums have undertaken mass digitisation efforts, producing large amounts of digital data. This event will look at how both digital collections and digital datasets raise similar ethical themes, such as:

  • The traces of often troublesome histories of taxonomy in the ways we categorise and classify digital data today
  • The subtle and not-so-subtle ways bias is embedded in image descriptions
  • Complexities of automating dataset/catalogue auditing (for example, moving beyond keyword searches to find and amend euphemistic, racist or sexist catalogue descriptions or data)
  • Terms & Agreements: rethinking open access, collecting and consent from source communities
  • The (human) labour involved in creating and maintaining digital data
  • The ethical implications of using digital heritage as training data for Machine Learning

This event will include two presentations, a workshop, and an open discussion:

  • Artist Aarati Akkapeddi will present their current creative exploration of the V&A’s collection and speak about the connections they find between datasets, collections and archives in their own practice.
  • Researcher Ananda Rutherford will discuss her work on Transforming Collections: Reimagining Art, Nation and Heritage at the Decolonising Arts Institute, UAL.
  • Artist Anna Ridler will lead a hands-on workshop looking at several canonical computer vision datasets. This workshop will try to shed light on the often obscured and opaque question of what has gone into training recent large language models. Participants will start to build their own datasets, concentrating on words that are difficult to define, images that are hard to classify and languages that no longer exist.