Jun 4, 2026 · arXiv:2606.06228

SAM-Flow: Source-Anchored Masked Flow for Training-Free Image Editing

Haowang Cui, Rui Chen, Tao Luo, Tao Guo, Zheng Qin, Jiaze Wang

AI Summary

This paper introduces SAM-Flow, a source-anchored masked flow framework for localized, training-free image editing. It targets background leakage, which arises when existing inversion-based and differential-flow-based methods transport the whole latent and so spread the edit into non-target regions. SAM-Flow uses a scout image and token-grounded attention maps to localize the editable regions, applies differential velocity updates only inside them, and anchors the remaining areas to the source-image latent trajectory. A time-varying source-anchored projection with dynamic soft masks, transition regions, and temporal mask accumulation improves spatial stability and boundary naturalness. The authors report accurate semantic editing and significantly better background preservation, without any fine-tuning.

Key Contribution

SAM-Flow mitigates background leakage in training-free image editing by restricting differential velocity updates to localized editable regions and anchoring all other regions to the source-image latent trajectory.

Abstract

Training-free image editing has recently attracted increasing attention due to its ability to modify real images using powerful pre-trained diffusion and flow-matching models without additional training. However, existing inversion-based and differential-flow-based methods usually perform global latent transport, which inevitably propagates editing effects to non-target regions and leads to background leakage. To address this problem, we propose SAM-Flow, a source-anchored masked flow framework for localized training-free image editing. Instead of updating the whole latent representation, SAM-Flow first uses a scout image and token-grounded attention maps to localize the editable semantic regions. It then applies differential velocity updates only within these regions, while anchoring the remaining areas to the source-image latent trajectory. To further improve spatial stability and boundary naturalness, we introduce a time-varying source-anchored projection mechanism with dynamic soft masks, transition regions, and temporal mask accumulation. The proposed method is plug-and-play and can be integrated with mainstream flow-matching backbones such as Stable Diffusion 3 and FLUX without any fine-tuning. Extensive qualitative and quantitative experiments demonstrate that SAM-Flow achieves accurate semantic editing while significantly improving background preservation, providing a simple and general localized editing paradigm for training-free image editing. Code is available at: https://github.com/chwbob/Sam-Flow.
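The masked update described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`soft_mask`, `accumulate_mask`, `masked_step`), the sigmoid soft mask, the exponential-moving-average accumulation, and all parameter values are assumptions chosen for illustration, and the latent is flattened to a plain list so the example needs no dependencies. It only shows the general idea of moving the latent by the velocity difference inside a soft mask while blending back to the source latent outside it.

```python
import math


def soft_mask(attn, threshold=0.5, sharpness=10.0):
    """Map raw attention scores to soft mask values in [0, 1].

    Hypothetical choice: min-max normalise, then apply a sigmoid so that
    the mask has a smooth transition region instead of a hard edge.
    """
    lo, hi = min(attn), max(attn)
    span = (hi - lo) or 1.0  # avoid division by zero on flat maps
    return [
        1.0 / (1.0 + math.exp(-sharpness * ((a - lo) / span - threshold)))
        for a in attn
    ]


def accumulate_mask(prev_mask, new_mask, decay=0.8):
    """Blend the previous mask with the new one (assumed EMA accumulation)."""
    return [decay * p + (1.0 - decay) * n for p, n in zip(prev_mask, new_mask)]


def masked_step(z_edit, z_src, v_src, v_tgt, mask, dt):
    """One Euler step of a masked differential-velocity update.

    Inside the mask the latent moves by dt * (v_tgt - v_src); outside it
    the latent is pulled back to the source latent z_src.
    """
    out = []
    for ze, zs, vs, vt, m in zip(z_edit, z_src, v_src, v_tgt, mask):
        moved = ze + dt * (vt - vs)
        out.append(m * moved + (1.0 - m) * zs)
    return out


# Toy usage on a 4-element "latent": the last two positions are editable.
attention = [0.0, 0.1, 0.9, 1.0]
mask = soft_mask(attention)
z_next = masked_step(
    z_edit=[1.0, 1.0, 1.0, 1.0],
    z_src=[0.0, 0.0, 0.0, 0.0],
    v_src=[0.0, 0.0, 0.0, 0.0],
    v_tgt=[1.0, 1.0, 1.0, 1.0],
    mask=mask,
    dt=0.1,
)
```

With a mask near zero, a position stays close to the source latent, and with a mask near one it follows the differential update. In a real pipeline the velocities would come from a flow-matching backbone evaluated on the source and target prompts, which this sketch replaces with constants.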

Computer Vision, Multimodal Models

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
