This paper presents a systematic review of 59 publications on dataset documentation tools, analyzing the motivations behind the tools, how their authors conceptualize documentation practices, and how the tools connect to existing systems, regulations, and cultural norms. The review identifies four persistent patterns that may hinder adoption and standardization: unclear operationalizations of documentation's value, decontextualized designs, unaddressed labor demands, and integration deferred to future work. The authors argue for a shift in Responsible AI tool design toward institutional rather than individual solutions to enable sustainable documentation practices.
Dataset documentation tools struggle to gain adoption because their designs tend to overlook institutional context and labor demands and rarely make the value of documentation concrete.
Dataset documentation is widely recognized as essential for the responsible development of automated systems. Despite growing efforts to support documentation through different kinds of artifacts, little is known about the motivations shaping documentation tool design or the factors hindering their adoption. We present a systematic review supported by mixed-methods analysis of 59 dataset documentation publications to examine the motivations behind building documentation tools, how authors conceptualize documentation practices, and how these tools connect to existing systems, regulations, and cultural norms. Our analysis shows four persistent patterns in dataset documentation conceptualization that potentially impede adoption and standardization: unclear operationalizations of documentation's value, decontextualized designs, unaddressed labor demands, and a tendency to treat integration as future work. Building on these findings, we propose a shift in Responsible AI tool design toward institutional rather than individual solutions, and outline actions the HCI community can take to enable sustainable documentation practices.