By using optimal transport to guide cross-attention, SceneTransporter disentangles image patches and 3D latents, leading to more coherent and geometrically faithful 3D scene generation from single images.
A single model handles multi-modal video generation, inpainting, and editing at cinematic resolutions with synchronized audio, while accepting diverse inputs such as text, images, video clips, and audio references.