3
0
5
By treating camera pose as a unifying geometric representation, WorldCam achieves significantly improved action controllability and long-horizon 3D consistency in interactive gaming world models compared to prior video diffusion transformer approaches.
Imagine a world model that doesn't just dream up environments, but flawlessly renders a real city like Seoul, complete with text-prompted scenarios and diverse camera movements.
By explicitly aligning attention with external correspondences, CORAL significantly improves detail preservation in virtual try-on, addressing a key limitation of existing Diffusion Transformer methods.