RewardFlow is an inversion-free framework that steers pretrained diffusion and flow-matching models at inference time using multi-reward Langevin dynamics. It unifies differentiable rewards for semantic alignment, perceptual fidelity, localized grounding, object consistency, and human preference, and adds a differentiable VQA-based reward for fine-grained semantic supervision. A prompt-aware adaptive policy extracts semantic primitives from the instruction and dynamically modulates reward weights and step sizes during sampling. The authors report state-of-the-art edit fidelity and compositional alignment across several image editing and compositional generation benchmarks.
Forget inversion: now you can steer diffusion and flow-matching models at inference time with a dynamically weighted soup of differentiable rewards, including a VQA-based reward for language-vision reasoning, and get state-of-the-art image edits.
We introduce RewardFlow, an inversion-free framework that steers pretrained diffusion and flow-matching models at inference time through multi-reward Langevin dynamics. RewardFlow unifies complementary differentiable rewards for semantic alignment, perceptual fidelity, localized grounding, object consistency, and human preference, and further introduces a differentiable VQA-based reward that provides fine-grained semantic supervision through language-vision reasoning. To coordinate these heterogeneous objectives, we design a prompt-aware adaptive policy that extracts semantic primitives from the instruction, infers edit intent, and dynamically modulates reward weights and step sizes throughout sampling. Across several image editing and compositional generation benchmarks, RewardFlow delivers state-of-the-art edit fidelity and compositional alignment.
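To make the phrase "multi-reward Langevin dynamics" concrete, here is a minimal sketch of a generic unadjusted Langevin update that ascends a weighted sum of differentiable rewards. It is not the authors' implementation: the function names (`multi_reward_langevin`, `quadratic_reward_grad`), the toy quadratic rewards, and the plain NumPy vector standing in for a diffusion or flow-matching latent are all assumptions made for illustration. The abstract does not give the exact update rule, so this shows only the textbook form, with weights and step size allowed to vary per step in the way an adaptive policy might set them.

```python
import numpy as np


def quadratic_reward_grad(center):
    """Gradient of a toy reward R(x) = -0.5 * ||x - center||^2 (hypothetical stand-in)."""
    c = np.asarray(center, dtype=float)
    return lambda x: c - x


def multi_reward_langevin(x0, reward_grads, weights, step_size, n_steps,
                          noise_scale=1.0, seed=0):
    """Unadjusted Langevin ascent on sum_i w_i * R_i(x).

    `weights` and `step_size` may be constants or callables of the step
    index t, which is where a schedule or adaptive policy would plug in.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for t in range(n_steps):
        w = weights(t) if callable(weights) else weights
        eta = step_size(t) if callable(step_size) else step_size
        # Drift: weighted sum of the individual reward gradients.
        drift = sum(w_i * grad(x) for w_i, grad in zip(w, reward_grads))
        # Gaussian noise scaled by sqrt(2 * eta), as in standard Langevin dynamics.
        noise = rng.standard_normal(x.shape)
        x = x + eta * drift + noise_scale * np.sqrt(2.0 * eta) * noise
    return x


if __name__ == "__main__":
    grads = [quadratic_reward_grad([1.0, 0.0]), quadratic_reward_grad([0.0, 1.0])]
    # With the noise switched off, the iterate settles at the weighted
    # average of the two reward optima.
    print(multi_reward_langevin([5.0, -3.0], grads, [0.75, 0.25], 0.1, 500,
                                noise_scale=0.0))
```

In this toy setting the weights decide where the trade-off between the two rewards lands, which is the role the abstract assigns to the prompt-aware policy. In a real pipeline the gradients would come from backpropagating through learned reward models rather than from closed-form quadratics.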