This paper introduces Mod-Guide, a content moderation feedback system that helps large language models (LLMs) recognize and respond to culturally insensitive speech directed at Bangladesh's Hindu and Chakma communities. The authors co-created a culturally grounded corpus of insensitive speech with community members and integrated those narratives into moderation pipelines via retrieval augmented generation (RAG). Mixed-method evaluations with minority and majority participants show that RAG-enhanced responses are more contextually accurate and are perceived differently across ethnic lines, underscoring the importance of incorporating minority perspectives in AI systems.
RAG-enhanced moderation grounded in community narratives can help LLMs respond to culturally insensitive speech with greater contextual accuracy.
Language operates as a mechanism of both marginalization and resistance, especially for minority communities navigating insensitive and harmful speech online. As content moderation increasingly depends on large language models (LLMs), concerns arise about whether these systems can recognize culturally insensitive speech, that is, language that disregards or marginalizes the cultural and religious perspectives of historically underrepresented communities, often through implicit erasure, misrepresentation, or normative framing rather than overt hostility. Focusing on Bangladesh's Hindu and Chakma communities (the country's largest religious and Indigenous ethnic minorities, respectively), this paper investigates the epistemic limits of LLM-based moderation systems and explores methods for incorporating minority perspectives. We co-created a culturally grounded corpus of insensitive speech with community members and integrated their narratives into moderation pipelines using retrieval augmented generation (RAG). Our tool, Mod-Guide, improves LLM sensitivity to minority viewpoints by leveraging contextual cues derived from lived experience. Through mixed-method evaluations involving both minority and majority participants, we demonstrate that RAG-enhanced moderation responses are more contextually accurate and perceived differently across ethnic lines. This work advances research in human-computer interaction, AI ethics, and social computing by foregrounding restorative justice and hermeneutical inclusion in the design of content moderation systems.
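To make the retrieval step concrete, the following is a minimal sketch of how community narratives might be retrieved and placed in front of a moderation prompt. It is not Mod-Guide's implementation: the abstract does not specify the retriever, the prompt format, or the model. The function names (`retrieve`, `build_prompt`), the bag-of-words cosine similarity, the prompt wording, and the placeholder corpus entries are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Placeholder entries standing in for community-authored narratives.
# These are NOT from the paper's corpus; they only illustrate the data shape.
CORPUS = [
    "Placeholder narrative: community members explain why a festival name is often misrepresented.",
    "Placeholder narrative: community members describe how traditional dress is framed as unusual.",
    "Placeholder narrative: community members discuss erasure of an Indigenous language in schools.",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, corpus, k=2):
    """Return up to k corpus entries with non-zero similarity to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(doc))), doc) for doc in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(post, corpus, k=2):
    """Assemble a moderation prompt with retrieved narratives as context.

    The prompt wording is hypothetical; a real system would pass this
    string to an LLM, which is out of scope for this sketch.
    """
    context = retrieve(post, corpus, k)
    lines = ["Community context:"]
    lines += [f"- {c}" for c in context] or ["- (none retrieved)"]
    lines.append("")
    lines.append(f"Post to review: {post}")
    lines.append("Explain whether the post is culturally insensitive, using the context above.")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("Why do they wear that unusual dress at the festival?", CORPUS))
```

A production pipeline would likely replace the bag-of-words scorer with dense embeddings and a vector index, but the structure stays the same: retrieve narratives relevant to the post, then condition the model's moderation response on them.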