This paper presents a large-scale audit of hate speech moderation on Twitter (now X) across eight languages, finding that 80% of hateful tweets remain online after five months. Neither the severity nor the visibility of hateful tweets increased their likelihood of removal. Through simulations of human-AI moderation pipelines, the authors show that substantially reducing user exposure to hate speech is economically feasible, at a cost below existing regulatory penalties. This suggests that the persistence of hate speech reflects choices in how moderation resources are allocated, not technical limitations alone.
Twitter's hate speech policies are failing: hateful tweets, even explicitly violent ones, are no more likely to be removed than non-hateful tweets.
Online hate speech is associated with substantial social harms, yet it remains unclear how consistently platforms enforce hate speech policies or whether enforcement is feasible at scale. We address these questions through a global audit of hate speech moderation on Twitter (now X). Using a complete 24-hour snapshot of public tweets, we construct representative samples comprising 540,000 tweets annotated for hate speech by trained annotators across eight major languages. Five months after posting, 80% of hateful tweets remain online, including explicitly violent hate speech. Such tweets are no more likely to be removed than non-hateful tweets, with neither severity nor visibility increasing the likelihood of removal. We then examine whether these enforcement gaps reflect technical limits of large-scale moderation systems. While fully automated detection systems cannot reliably identify hate speech without generating large numbers of false positives, they effectively prioritize likely violations for human review. Simulations of a human-AI moderation pipeline indicate that substantially reducing user exposure to hate speech is economically feasible at a cost below existing regulatory penalties. These results suggest that the persistence of online hate cannot be explained by technical constraints alone but also reflects institutional choices in the allocation of moderation resources.
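The human-AI pipeline described above, in which an automated detector ranks likely violations and human reviewers check only the top of the ranking, can be illustrated with a small simulation. The following is a minimal sketch on synthetic data, not the authors' implementation: the function names `simulate_review` and `make_synthetic_tweets`, the score distributions, the 1% hate rate, the view counts, and the assumption that human reviewers label perfectly are all hypothetical choices made for illustration.

```python
import random


def simulate_review(tweets, review_budget):
    """Return the share of hateful-tweet views removed under a review budget.

    Each tweet is a (score, is_hateful, views) tuple. Tweets are ranked by
    classifier score, the top `review_budget` go to human reviewers, and
    reviewers are assumed (hypothetically) to label every tweet correctly.
    """
    ranked = sorted(tweets, key=lambda t: t[0], reverse=True)
    reviewed = ranked[:review_budget]
    removed_views = sum(views for _, hateful, views in reviewed if hateful)
    total_views = sum(views for _, hateful, views in tweets if hateful)
    return removed_views / total_views if total_views else 0.0


def make_synthetic_tweets(n, hate_rate, seed=0):
    """Generate synthetic tweets; all distributions here are assumptions."""
    rng = random.Random(seed)
    tweets = []
    for _ in range(n):
        hateful = rng.random() < hate_rate
        # Noisy classifier score: hateful tweets tend to score higher,
        # but the two distributions overlap, so the ranking is imperfect.
        score = rng.gauss(0.7 if hateful else 0.3, 0.15)
        views = rng.randint(1, 1000)
        tweets.append((score, hateful, views))
    return tweets


tweets = make_synthetic_tweets(n=10_000, hate_rate=0.01)
for budget in (100, 500, 2_000):
    share = simulate_review(tweets, budget)
    print(f"review {budget:>5} tweets -> {share:.0%} of hateful views removed")
```

Because the detector is used only to order the review queue, its false positives cost reviewer time instead of causing wrongful removals. The review budget is the knob that trades moderation cost against remaining user exposure.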