This position paper argues that model collapse, arising from training generative models on the outputs of prior models, poses a significant threat to AI democratization efforts. It highlights how model collapse exacerbates data degradation, reinforces cultural biases, and leads to inefficient resource use, disproportionately impacting low-resource and marginalized communities. The paper calls for action to mitigate these effects, emphasizing both environmental and cultural implications.
Model collapse isn't just a technical problem; it's a threat to AI democratization that will widen the gap between high- and low-resource communities.
Model collapse, the degradation in performance that arises when generative models are trained on the outputs of prior models, is an increasing concern as artificially generated content proliferates. Related critiques of large language models have highlighted their tendency to reproduce frequent patterns in training data, their reliance on vast datasets, and their substantial environmental cost. Together, these factors contribute to data degradation, the reinforcement of cultural biases, and inefficient resource use. In this position paper, we combine these views and argue that model collapse threatens current efforts to democratize AI. By reducing training efficiency and skewing data distributions away from the tails of their support, model collapse disproportionately impacts low-resource and marginalized communities. We examine both the environmental and cultural implications of this phenomenon, situate our position among recent position papers on model collapse, and close with a call to action and initial directions for mitigating these effects.
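The claim that recursive training skews distributions away from their tails can be illustrated with a toy simulation. The sketch below is not taken from the paper: it assumes a categorical "model" that is refit by maximum likelihood on a finite sample drawn from the previous generation, and a Zipf-like starting distribution over 50 categories. The function name `resample_generations` and all parameter values are hypothetical choices made for illustration.

```python
import random


def resample_generations(probs, n_samples, n_generations, seed=0):
    """Repeatedly refit a categorical distribution on its own samples.

    Returns the final probabilities and the support size (number of
    categories with non-zero probability) after each generation.
    """
    rng = random.Random(seed)
    current = list(probs)
    support_sizes = [sum(p > 0 for p in current)]
    for _ in range(n_generations):
        # Only categories that still have probability mass can be drawn.
        alive = [i for i, p in enumerate(current) if p > 0]
        draws = rng.choices(alive, weights=[current[i] for i in alive], k=n_samples)
        counts = [0] * len(current)
        for category in draws:
            counts[category] += 1
        # Maximum-likelihood refit: empirical frequencies of the sample.
        current = [c / n_samples for c in counts]
        support_sizes.append(sum(p > 0 for p in current))
    return current, support_sizes


# Assumed starting point: a Zipf-like distribution with a long tail.
weights = [1 / (rank + 1) for rank in range(50)]
total = sum(weights)
initial = [w / total for w in weights]

final, support_sizes = resample_generations(initial, n_samples=200, n_generations=30)
print(support_sizes)
```

In this setup a category that receives zero samples in one generation can never reappear, so the support size can only stay flat or shrink, and rare categories are the most likely to be lost. This is a deliberately simplified picture; real generative models and mixed human and synthetic corpora behave in more complicated ways.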