This paper introduces a steganography-based attribution framework to trace AI-generated images used in harmful multimodal contexts. The framework embeds cryptographically signed identifiers into images using watermarking and employs a CLIP-based model for detecting harmful image-text combinations. Experiments show that wavelet-domain spread-spectrum watermarking is robust to distortions, and the multimodal detector achieves high accuracy (AUC-ROC of 0.99), enabling reliable attribution.
Spotting AI-generated harms requires more than just image analysis: this system links images to their source even when paired with harmful text, enabling accountability.
The rapid growth of generative AI has introduced new challenges in content moderation and digital forensics. In particular, benign AI-generated images can be paired with harmful or misleading text, creating misuse that is difficult to detect. This contextual misuse undermines traditional moderation frameworks and complicates attribution, as synthetic images typically lack persistent metadata or device signatures. We introduce a steganography-enabled attribution framework that embeds cryptographically signed identifiers into images at creation time and uses multimodal harmful-content detection as a trigger for attribution verification. Our system evaluates five watermarking methods across spatial, frequency, and wavelet domains. It also integrates a CLIP-based fusion model for multimodal harmful-content detection. Experiments demonstrate that spread-spectrum watermarking, especially in the wavelet domain, provides strong robustness under blur distortions, and our multimodal fusion detector achieves an AUC-ROC of 0.99, enabling reliable cross-modal attribution verification. These components form an end-to-end forensic pipeline that enables reliable tracing of harmful deployments of AI-generated imagery, supporting accountability in modern synthetic media environments. Our code is available on GitHub: https://github.com/bli1/steganography
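To make the embed-then-verify idea concrete, the following is a minimal sketch of spread-spectrum embedding of an authenticated identifier. It is not the authors' implementation, and it rests on several assumptions: a flat list of floats stands in for the wavelet detail coefficients of an image (the wavelet transform itself is omitted), a truncated HMAC-SHA256 tag stands in for whatever signature scheme the framework actually uses, and every function name and parameter (`sign_identifier`, `embed`, `extract`, `alpha`, `chip_len`, `TAG_LEN`) is hypothetical.

```python
import hashlib
import hmac
import random

TAG_LEN = 8  # bytes of truncated HMAC tag; an illustrative choice


def sign_identifier(identifier: bytes, key: bytes) -> bytes:
    """Append a truncated HMAC tag (a stand-in for a real signature)."""
    tag = hmac.new(key, identifier, hashlib.sha256).digest()[:TAG_LEN]
    return identifier + tag


def verify_payload(payload: bytes, key: bytes):
    """Return the identifier if the tag checks out, otherwise None."""
    identifier, tag = payload[:-TAG_LEN], payload[-TAG_LEN:]
    expected = hmac.new(key, identifier, hashlib.sha256).digest()[:TAG_LEN]
    return identifier if hmac.compare_digest(tag, expected) else None


def to_bits(data: bytes):
    """Expand bytes into a list of bits, most significant bit first."""
    return [(byte >> i) & 1 for byte in data for i in range(7, -1, -1)]


def from_bits(bits):
    """Pack a list of bits (MSB first) back into bytes."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)


def chip_sequence(seed: int, n: int):
    """Pseudo-random +/-1 spreading sequence shared by embedder and detector."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(n)]


def embed(coeffs, bits, seed, alpha=8.0, chip_len=128):
    """Spread each bit over chip_len coefficients as +/-alpha times the chips."""
    needed = len(bits) * chip_len
    if len(coeffs) < needed:
        raise ValueError("not enough coefficients for the payload")
    pn = chip_sequence(seed, needed)
    out = list(coeffs)
    for i, bit in enumerate(bits):
        sign = 1 if bit else -1
        for j in range(i * chip_len, (i + 1) * chip_len):
            out[j] += alpha * sign * pn[j]
    return out


def extract(coeffs, n_bits, seed, chip_len=128):
    """Recover bits from the sign of the correlation with the chip sequence."""
    pn = chip_sequence(seed, n_bits * chip_len)
    bits = []
    for i in range(n_bits):
        segment = range(i * chip_len, (i + 1) * chip_len)
        corr = sum(coeffs[j] * pn[j] for j in segment)
        bits.append(1 if corr > 0 else 0)
    return bits


# Usage: embed a signed 4-byte identifier, perturb the result, then verify.
key = b"demo-key"
payload = sign_identifier((1234).to_bytes(4, "big"), key)
host_rng = random.Random(0)
host = [host_rng.gauss(0, 10) for _ in range(len(payload) * 8 * 128)]
marked = embed(host, to_bits(payload), seed=42)
distorted = [c + host_rng.gauss(0, 2) for c in marked]  # crude distortion
recovered = from_bits(extract(distorted, len(payload) * 8, seed=42))
identifier = verify_payload(recovered, key)
```

Spreading each bit across many coefficients is what gives this family of methods its tolerance to distortion: the correlation sums the embedded signal coherently while the host content and added noise average out. In a real pipeline, verification of the recovered identifier would run only after the multimodal detector flags an image-text pair.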