Welcome to The Experimental Humanities Collaborative Network
Latent Space: From Datasets to Digital Heritage
with Aarati Akkapeddi, Ananda Rutherford, and Anna Ridler
Faculty-led
June 2, 2023 at 11:00 AM - 4:00 PM
Keynes Library
School of Arts, Birkbeck, University of London, 43 Gordon Square, London
Incomplete data, acquisition transparency, and problematic taxonomies are issues that arguably span both Archival Ethics and AI Ethics. The bias of major Machine Learning datasets like ImageNet has been widely criticised, leading to debates over collection methodologies and consent. At the same time, in the past decade, many museums have undertaken mass digitisation efforts, producing large amounts of digital data. This event will look at how digital collections and digital datasets raise similar ethical themes, such as:
The traces of often troublesome histories of taxonomy in the ways we categorise and classify digital data today
The subtle and not-so-subtle ways bias is embedded in image descriptions
Complexities of automating dataset/catalogue auditing (for example, moving beyond keyword searches to find and amend euphemistic, racist or sexist catalogue descriptions or data)
Terms & Agreements: rethinking open access, collecting and consent from source communities
The (human) labour involved in creating and maintaining digital data
The ethical implications of using digital heritage as training data for Machine Learning
This event will include two presentations, a workshop, and an open discussion:
Artist Aarati Akkapeddi will present their current creative exploration of the V&A’s collection and speak about the connections they find between datasets, collections and archives in their own practice.
Artist Anna Ridler will lead a hands-on workshop looking at several canonical computer vision datasets. This workshop will try to shed light on the often obscured and opaque material that has gone into training recent large language models. Participants will start to build their own datasets, concentrating on words that are difficult to define, images that are hard to classify and languages that no longer exist.