This ethnographic study investigates a civic-tech initiative that creates safety datasets through a collaborative approach, emphasizing the involvement of those most affected by online harms. By applying a reparative justice framework, the authors highlight the challenges of achieving equitable compensation for data work and collective governance of AI datasets. The findings suggest that reorienting data work towards accountability and repair can fundamentally reshape the relationship between humans, datasets, and AI systems, promoting a more responsible and inclusive data production process.
Rethinking data work through a reparative justice lens reveals that accountability, not just algorithms, should be at the heart of AI safety efforts.
We present an ethnographic study of an alternative approach to data work, developed by a civic-tech initiative that builds datasets for training and benchmarking online safety systems. The initiative aims to respond to online safety concerns from a feminist perspective, by building safety datasets collaboratively with those most impacted by online harms. In this paper, we examine how this approach aims to reorient data work as a site for repair and redress, and trace the struggles the initiative encounters in the process. Specifically, we draw attention to the challenges and tensions involved in advancing just reward for data work and collective governance of AI datasets. Examining these challenges through an STS-informed lens of reparative justice and repair, we argue that the work of repairing data work (and AI) lies, fundamentally, in resetting the ties of accountability. At a time of heightened emphasis on efforts like safety evaluations and red teaming to make AI more responsible, we highlight the need to confront foundational questions about how the humans involved in these efforts relate to the datasets and systems they help produce. A reparative lens demands that we interrupt prevailing norms of data work and place at their centre, not AI or datasets, but those most harmed by the neglect, oversight and exclusion animated in the current modes of dataset production. This, we argue, offers a bold vision for responsibility and contributes towards a critical agenda for building alternative futures of data and AI practice.