The paper introduces Bn-HIB, a new dataset of 3,247 manually annotated Bengali memes categorized as Benign, Hate, or Inflammatory, which separates inflammatory content from direct hate speech. To classify these memes, the authors propose MCFM (Multi-Modal Co-Attention Fusion Model), which uses a co-attention mechanism to fuse visual and textual features. In the reported experiments, MCFM outperforms several state-of-the-art models on the Bn-HIB dataset, indicating that it is effective at detecting hate and inflammatory content in Bengali memes.
A new dataset of Bengali memes distinguishes inflammatory content from direct hate speech, revealing the unique challenges of detecting harmful content in low-resource, multimodal settings.
Internet memes have become a dominant form of expression on social media, including within the Bengali-speaking community. While often humorous, memes can also be exploited to spread offensive, harmful, and inflammatory content targeting individuals and groups. Detecting this type of content is exceptionally challenging due to its satirical, subtle, and culturally specific nature. This problem is magnified for low-resource languages like Bengali, as existing research predominantly focuses on high-resource languages. To address this critical research gap, we introduce Bn-HIB (Bangla Hate Inflammatory Benign), a novel dataset containing 3,247 manually annotated Bengali memes categorized as Benign, Hate, or Inflammatory. Significantly, Bn-HIB is the first dataset to distinguish inflammatory content from direct hate speech in Bengali memes. Furthermore, we propose the MCFM (Multi-Modal Co-Attention Fusion Model), a simple yet effective architecture that mutually analyzes both the visual and textual elements of a meme. MCFM employs a co-attention mechanism to identify and fuse the most critical features from each modality, leading to a more accurate classification. Our experiments show that MCFM significantly outperforms several state-of-the-art models on the Bn-HIB dataset, demonstrating its effectiveness in this nuanced task.
Warning: This work contains material that may be disturbing to some audience members. Viewer discretion is advised.
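The abstract describes MCFM only at a high level, so the following is a minimal, dependency-free sketch of one generic form of co-attention fusion, not the authors' implementation. Everything specific in it is an assumption made for illustration: the bilinear affinity matrix `w_affinity`, the mean pooling, the single linear classifier (`w_out`, `b_out`), the feature sizes, and the random stand-in features (a real system would take these from text and image encoders). Only the three class labels come from the source.

```python
import math
import random

LABELS = ["Benign", "Hate", "Inflammatory"]  # the three Bn-HIB classes


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def mean_pool(rows):
    """Average a list of d-dimensional vectors into one d-dimensional vector."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def co_attention_fuse(text_feats, image_feats, w_affinity):
    """Fuse text tokens (n_tokens x d) and image regions (n_regions x d).

    Assumed design: a bilinear affinity score between every token and every
    region, attention in both directions, then mean pooling and concatenation.
    """
    # affinity[i][j] scores text token i against image region j
    affinity = matmul(matmul(text_feats, w_affinity), transpose(image_feats))
    text_attn = [softmax(row) for row in affinity]              # tokens attend to regions
    image_attn = [softmax(row) for row in transpose(affinity)]  # regions attend to tokens
    text_ctx = matmul(text_attn, image_feats)    # (n_tokens x d) image-aware token vectors
    image_ctx = matmul(image_attn, text_feats)   # (n_regions x d) text-aware region vectors
    return mean_pool(text_ctx) + mean_pool(image_ctx)  # concatenated, length 2d


def classify(fused, w_out, b_out):
    """Linear layer plus softmax over the three labels (hypothetical classifier head)."""
    logits = [sum(f * w for f, w in zip(fused, col)) + b
              for col, b in zip(zip(*w_out), b_out)]
    probs = softmax(logits)
    return LABELS[probs.index(max(probs))], probs


# Random stand-ins for encoder outputs and untrained weights (illustration only).
rng = random.Random(0)


def rand_matrix(n, m):
    return [[rng.gauss(0.0, 1.0) for _ in range(m)] for _ in range(n)]


d = 8
text_feats = rand_matrix(5, d)    # e.g. 5 caption tokens
image_feats = rand_matrix(4, d)   # e.g. 4 image regions
w_affinity = rand_matrix(d, d)
w_out = rand_matrix(2 * d, len(LABELS))
b_out = [0.0] * len(LABELS)

fused = co_attention_fuse(text_feats, image_feats, w_affinity)
label, probs = classify(fused, w_out, b_out)
print(label, [round(p, 3) for p in probs])
```

Because the weights here are random and untrained, the printed label carries no meaning. The sketch is meant only to show the data flow: each modality attends over the other before the two pooled representations are combined for classification.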