VCIP, CS, Nankai University
Achieve up to 2.5X faster video object removal by focusing DiT computations only on the essential tokens dictated by the mask.
Forget expensive human feedback loops: a VLM-powered reward function can efficiently align image editing diffusion models with human preferences.