The paper introduces LEGO, a framework for synthetic image detection that addresses the limitations of existing methods by focusing on generator-specific artifacts. LEGO uses multiple LoRA modules, each pretrained on a single-generator dataset to capture that generator's unique artifacts, together with an MLP that modulates the modules and attention layers that fuse their features. The reported results show LEGO outperforming prior state-of-the-art methods with fewer than 30,000 training images (less than 10% of their training data) and only 5 epochs per training stage, while remaining extensible to new generators by adding LoRA modules.
LEGO's modular design lets you detect synthetic images with less than a tenth of the training data and only 5 epochs per training stage, all by focusing on the unique fingerprints of each image generator.
The rapid advancement of generative technologies has made synthetic images nearly indistinguishable from real ones, thereby creating an urgent need for robust detectors to counter misinformation. However, existing methods mainly rely on universal artifact features that are shared across multiple generators. We observe that as the diversity of generators increases, the overlap of these common features gradually decreases, which severely undermines model generalization. Conversely, focusing only on unique artifacts tends to cause overfitting to specific forgery patterns. To address this challenge, we propose LEGO (LoRA-Enabled Generator-Oriented Framework). The core mechanism of LEGO employs an MLP to modulate multiple LoRA (Low-Rank Adaptation) blocks, each pretrained to capture the unique artifacts of a specific generator, followed by attention-based feature fusion. Unlike conventional methods that seek a single universal solution, LEGO delegates unique artifact extraction to specialized LoRA modules by dividing its training procedure into two stages. In the first stage, each LoRA module is individually trained on a single-generator dataset to learn generator-specific representations; in the second, the MLP and attention layers are trained on mixed datasets to dynamically regulate the contribution of each module. Benefiting from its modular yet robust design, LEGO can be naturally extended by incorporating new LoRA modules for adaptation to newly emerging next-generation datasets. It also achieves substantially better performance than prior SOTA methods while using fewer than 30,000 training images (less than 10% of their training data) and only 5 epochs in each training stage.
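The forward pass described in the abstract can be illustrated with a toy sketch. This is one possible reading, not the authors' implementation: the abstract does not give layer sizes, where the LoRA blocks sit in the backbone, or the exact form of the attention fusion. The names `LoRABranch`, `lego_forward`, `W1`, `W2`, and `query` are hypothetical, the gate is assumed to be a softmax over branches, the fusion is assumed to be scaled dot-product attention against a single query vector, and random values stand in for pretrained weights.

```python
import math
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def softmax(z):
    """Numerically stable softmax over a list of floats."""
    top = max(z)
    exps = [math.exp(t - top) for t in z]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix used as a placeholder for trained weights."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class LoRABranch:
    """Shared frozen weight W plus one generator-specific low-rank update B @ A."""

    def __init__(self, W, rank, rng):
        self.W = W
        # Random values stand in for weights pretrained on one generator's data.
        self.A = rand_matrix(rank, len(W[0]), rng)
        self.B = rand_matrix(len(W), rank, rng)

    def forward(self, x):
        base = matvec(self.W, x)
        delta = matvec(self.B, matvec(self.A, x))  # low-rank path: B (A x)
        return [b + d for b, d in zip(base, delta)]


def lego_forward(x, branches, W1, W2, query):
    # Gating MLP (assumed shape): one ReLU layer, then a softmax over branches.
    hidden = [max(0.0, h) for h in matvec(W1, x)]
    gates = softmax(matvec(W2, hidden))
    # Modulate each branch's feature vector by its gate.
    modulated = [[g * f for f in br.forward(x)] for g, br in zip(gates, branches)]
    # Attention fusion (assumed form): scaled dot-product scores against a query.
    scale = math.sqrt(len(query))
    scores = [sum(q * f for q, f in zip(query, feat)) / scale for feat in modulated]
    attn = softmax(scores)
    fused = [sum(a * feat[j] for a, feat in zip(attn, modulated))
             for j in range(len(query))]
    return fused, gates, attn


rng = random.Random(0)
d_in, d_out, rank, n_generators = 4, 3, 2, 3
W = rand_matrix(d_out, d_in, rng)  # frozen backbone weight shared by all branches
branches = [LoRABranch(W, rank, rng) for _ in range(n_generators)]
W1 = rand_matrix(8, d_in, rng)
W2 = rand_matrix(n_generators, 8, rng)
query = [rng.gauss(0.0, 1.0) for _ in range(d_out)]

fused, gates, attn = lego_forward([0.5, -1.0, 0.25, 2.0], branches, W1, W2, query)
print(len(fused), round(sum(gates), 6), round(sum(attn), 6))
```

Under this reading, the first training stage would fit `A` and `B` for one branch at a time on a single-generator dataset, and the second stage would fit `W1`, `W2`, and `query` on mixed data. Supporting a new generator would then mean appending a branch and widening `W2` by one row. Whether the LoRA branches stay frozen during the second stage is not stated in the abstract.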