This paper introduces CHASM, a novel dataset for evaluating Multimodal Large Language Models (MLLMs) on detecting covert advertisements on Chinese social media, specifically the platform Rednote. The study finds that current MLLMs do not reliably identify these deceptive posts and that, even with fine-tuning, significant challenges remain in detecting subtle cues in comments and differences in visual and textual structure. The findings underscore the need for better detection mechanisms to address the ethical and legal implications of covert advertising on social media.
Current MLLMs cannot reliably detect covert advertisements, revealing a critical gap in social media moderation that leaves consumers open to being misled and raises ethical risks.
Current benchmarks for evaluating large language models (LLMs) in social media moderation completely overlook a serious threat: covert advertisements, which disguise themselves as regular posts to deceive and mislead consumers into making purchases, leading to significant ethical and legal concerns. In this paper, we present CHASM, a first-of-its-kind dataset designed to evaluate the capability of Multimodal Large Language Models (MLLMs) in detecting covert advertisements on social media. CHASM is a high-quality, anonymized, manually curated dataset consisting of 4,992 instances, based on real-world scenarios from the Chinese social media platform Rednote. The dataset was collected and annotated under strict privacy protection and quality control protocols. It includes many product experience sharing posts that closely resemble covert advertisements, making the dataset particularly challenging. The results show that under both zero-shot and in-context learning settings, none of the current MLLMs are sufficiently reliable for detecting covert advertisements. Our further experiments revealed that fine-tuning open-source MLLMs on our dataset yielded noticeable performance gains. However, significant challenges persist, such as detecting subtle cues in comments and differences in visual and textual structures. We provide an in-depth error analysis and outline future research directions. We hope our study can serve as a call for the research community and platform moderators to develop more precise defenses against this emerging threat.