The paper introduces SHIFT, a training-free attack that defeats diffusion-based watermarks by disrupting the reverse diffusion trajectory. SHIFT leverages stochastic diffusion resampling to deflect the generative trajectory in latent space, decoupling the reconstructed image from the watermark. Experiments across nine watermarking methods show that SHIFT achieves attack success rates of 95-100% while maintaining semantic quality, highlighting a fundamental vulnerability in trajectory-dependent watermarking schemes.
Diffusion-based watermarks, thought to be secure, can be bypassed with 95-100% success by a simple, training-free stochastic resampling step that breaks trajectory reconstruction.
Diffusion-based watermarking methods embed verifiable marks by manipulating the initial noise or the reverse diffusion trajectory. However, these methods share a critical assumption: verification can succeed only if the diffusion trajectory can be faithfully reconstructed. This reliance on trajectory recovery constitutes a fundamental and exploitable vulnerability. We propose $\underline{\mathbf{S}}$tochastic $\underline{\mathbf{Hi}}$dden-Trajectory De$\underline{\mathbf{f}}$lec$\underline{\mathbf{t}}$ion ($\mathbf{SHIFT}$), a training-free attack that exploits this common weakness across diverse watermarking paradigms. SHIFT leverages stochastic diffusion resampling to deflect the generative trajectory in latent space, making the reconstructed image statistically decoupled from the original watermark-embedded trajectory while preserving strong visual quality and semantic consistency. Extensive experiments on nine representative watermarking methods spanning noise-space, frequency-domain, and optimization-based paradigms show that SHIFT achieves 95%--100% attack success rates with nearly no loss in semantic quality, without requiring any watermark-specific knowledge or model retraining.
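The dependence on trajectory recovery can be illustrated with a small numerical toy. The sketch below is not the authors' SHIFT implementation. It assumes a hypothetical setting in which the data distribution is a standard Gaussian, so the optimal noise prediction has a closed form and no neural network is needed. It also assumes a sign-based watermark on the initial noise, in the spirit of noise-space schemes. All names (`ddim_step`, `stochastic_resample`, `demo`, the schedule values) are illustrative. The toy shows only the decoupling effect of fresh noise. It says nothing about semantic quality, which in a real model comes from the learned image prior.

```python
import numpy as np


def ddim_step(x, a_from, a_to):
    """One deterministic DDIM update between cumulative alphas a_from -> a_to.

    Toy assumption: the data are N(0, I), for which the optimal noise
    prediction is sqrt(1 - a_from) * x (no neural network involved).
    """
    eps_hat = np.sqrt(1.0 - a_from) * x
    x0_hat = (x - np.sqrt(1.0 - a_from) * eps_hat) / np.sqrt(a_from)
    return np.sqrt(a_to) * x0_hat + np.sqrt(1.0 - a_to) * eps_hat


def run_deterministic(x, schedule):
    """Apply deterministic DDIM steps along consecutive schedule entries."""
    for a_from, a_to in zip(schedule[:-1], schedule[1:]):
        x = ddim_step(x, a_from, a_to)
    return x


def bit_accuracy(image, key_bits, schedule):
    """Invert the image to its initial noise and compare signs with the key."""
    recovered = run_deterministic(image, schedule[::-1])
    return float(np.mean((recovered > 0) == key_bits))


def stochastic_resample(image, k, schedule, rng):
    """Re-noise the image to schedule[k] with FRESH noise, then denoise.

    The fresh noise replaces most of the original trajectory, so a later
    inversion lands on a latent that is largely unrelated to the key.
    """
    a_mid = schedule[k]
    fresh = rng.standard_normal(image.shape)
    x_mid = np.sqrt(a_mid) * image + np.sqrt(1.0 - a_mid) * fresh
    return run_deterministic(x_mid, schedule[k:])


def demo(dim=4096, steps=50, k=2, seed=0):
    rng = np.random.default_rng(seed)
    schedule = np.linspace(0.01, 1.0, steps)  # ordered noisy -> clean
    key_bits = rng.integers(0, 2, size=dim).astype(bool)
    # Watermark: force the sign of each initial-noise coordinate to match the key.
    x_T = np.abs(rng.standard_normal(dim)) * np.where(key_bits, 1.0, -1.0)
    image = run_deterministic(x_T, schedule)
    attacked = stochastic_resample(image, k, schedule, rng)
    return (bit_accuracy(image, key_bits, schedule),
            bit_accuracy(attacked, key_bits, schedule))


if __name__ == "__main__":
    before, after = demo()
    print(f"bit accuracy before attack: {before:.3f}, after: {after:.3f}")
```

In this toy every deterministic step is a positive rescaling, so inversion recovers the key signs of the untouched image exactly. After resampling from a high noise level, the recovered signs agree with the key only slightly more often than chance. Choosing a larger `k` (less added noise) leaves more of the original trajectory, and therefore more of the watermark, intact.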