The paper introduces RefVFX, a framework for transferring complex temporal visual effects from a reference video to a target video or image in a feed-forward manner. To train the model, the authors created a large-scale dataset of video triplets using a novel automated pipeline that preserves input motion while applying repeatable effects, augmented with LoRA-derived and programmatically generated data. Experiments demonstrate that RefVFX generalizes to unseen effects, produces temporally coherent edits, and outperforms text-prompt baselines.
Forget tedious prompt engineering – RefVFX lets you copy and paste visual effects between videos with a single reference clip.
We present RefVFX, a new framework that transfers complex temporal effects from a reference video onto a target video or image in a feed-forward manner. While existing methods excel at prompt-based or keyframe-conditioned editing, they struggle with dynamic temporal effects such as lighting changes or character transformations, which are difficult to describe via text or static conditions. Transferring a video effect is challenging, as the model must integrate the new temporal dynamics with the input video's existing motion and appearance. To address this, we introduce a large-scale dataset of triplets, where each triplet consists of a reference effect video, an input image or video, and a corresponding output video depicting the transferred effect. Creating this data is non-trivial, especially the video-to-video effect triplets, which do not exist naturally. To generate these, we propose a scalable automated pipeline that creates high-quality paired videos designed to preserve the input's motion and structure while transforming it based on a fixed, repeatable effect. We then augment this data with image-to-video effects derived from LoRA adapters and code-based temporal effects generated through programmatic composition. Building on our new dataset, we train our reference-conditioned model using recent text-to-video backbones. Experimental results demonstrate that RefVFX produces visually consistent and temporally coherent edits, generalizes across unseen effect categories, and outperforms prompt-only baselines in both quantitative metrics and human preference. See our website at https://tuningfreevisualeffects-maker.github.io/Tuning-free-Visual-Effect-Transfer-across-Videos-Project-Page/
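As a rough illustration of the training data described in the abstract, each triplet pairs a reference effect video with an input (image or video) and the corresponding output showing the transferred effect. The sketch below uses hypothetical field names and file paths; it is not from the RefVFX codebase, only a minimal rendering of the described structure.

```python
from dataclasses import dataclass

# Hypothetical sketch of one training triplet as described in the abstract.
# Field names and paths are illustrative assumptions, not the authors' schema.
@dataclass
class EffectTriplet:
    reference_effect: str  # video demonstrating the effect to transfer
    target_input: str      # input image or video to be edited
    expected_output: str   # video depicting the transferred effect
    source: str            # which data stream produced this triplet:
                           # "paired-pipeline", "lora-derived", or "programmatic"

triplet = EffectTriplet(
    reference_effect="ref/lighting_shift.mp4",
    target_input="inputs/dog_running.mp4",
    expected_output="outputs/dog_running_lighting_shift.mp4",
    source="paired-pipeline",
)
```

The three `source` values mirror the abstract's three data streams: the automated paired-video pipeline, image-to-video effects derived from LoRA adapters, and programmatically composed temporal effects.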