This paper introduces SYNCRED-Bench, a benchmark for evaluating the detection of synthetic credibility in AI-generated visual misinformation. It comprises 600 images balanced across six credible-form categories and seven circulation styles, paired with FP450, a real-image negative set for measuring false positives. Under a 5% false-positive-rate constraint, existing systems perform poorly: 15 MLLMs reach a true positive rate (TPR) of only 10.5%, open-source AIGC detectors fall below 5%, and commercial APIs reach 57.6%. Human annotators reach only 63% TPR, which underscores the need for better detection methods against this emerging misinformation threat.
Existing detection systems fail to reliably identify synthetic credibility: under a 5% false-positive-rate constraint, 15 MLLMs achieve only a 10.5% true positive rate.
Recent generative models can now produce visual artifacts with realistic embedded text and layouts, creating a new misinformation threat: synthetic credibility. We introduce SYNCRED-Bench, a benchmark of 600 AI-generated misinformation images balanced across six credible-form categories and seven fine-grained circulation styles, together with FP450, a real-image negative set for measuring false positives. Extensive evaluation shows that existing systems remain unreliable: under a 5% false-positive-rate constraint, 15 MLLMs achieve only 10.5% true positive rate (TPR), open-source AIGC detectors achieve less than 5%, and commercial APIs reach 57.6%. Human annotators also struggled to identify synthetic credibility, reaching only 63% TPR. These findings establish synthetic credibility as a severe and underexplored visual misinformation challenge, and provide a benchmark for developing detectors that reason beyond superficial credibility cues.
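The metric quoted above, TPR under a 5% false-positive-rate constraint, can be illustrated with a minimal sketch. It assumes each detector emits a continuous "synthetic" score per image, and that the decision threshold is calibrated on a real-image negative set (the role FP450 plays here) before TPR is measured on the synthetic images. The function name `tpr_at_fpr` and the toy scores are hypothetical, and the paper's exact evaluation protocol may differ, for example for systems that return only binary verdicts.

```python
import math


def tpr_at_fpr(pos_scores, neg_scores, max_fpr=0.05):
    """Return (tpr, threshold) for the most permissive threshold whose
    false positive rate on neg_scores does not exceed max_fpr.

    An image is flagged as synthetic when its score is strictly greater
    than the threshold. Hypothetical helper, not the paper's code.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both score lists must be non-empty")

    n_neg = len(neg_scores)
    # Maximum number of real images we may wrongly flag.
    # The small epsilon guards against float round-off in max_fpr * n_neg.
    allowed_fp = math.floor(max_fpr * n_neg + 1e-9)

    neg_sorted = sorted(neg_scores, reverse=True)
    if allowed_fp >= n_neg:
        threshold = float("-inf")  # every image may be flagged
    else:
        # At most allowed_fp negatives score strictly above this value.
        threshold = neg_sorted[allowed_fp]

    true_positives = sum(1 for s in pos_scores if s > threshold)
    return true_positives / len(pos_scores), threshold


# Toy scores for illustration only; these are not results from the paper.
real_image_scores = [i / 20 for i in range(20)]      # negatives
synthetic_image_scores = [0.99, 0.97, 0.92, 0.5, 0.3]  # positives

tpr, threshold = tpr_at_fpr(synthetic_image_scores, real_image_scores)
print(f"threshold={threshold:.2f}, TPR at 5% FPR={tpr:.2f}")
```

Fixing the threshold on the negative set first means a detector cannot raise its TPR simply by flagging more images, which is why results at a fixed false positive rate are comparable across systems.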