Training video generation models to explicitly infer latent physical properties yields more physically plausible videos than simply scaling data and model size.
Forget GAN inversion: you can now steer diffusion models with a dynamically weighted soup of differentiable rewards, including a VQA-based reward for language-vision reasoning, and get SOTA image-editing results.
Hallucinations in 3D embodied agents can be significantly reduced at inference time by contrasting predictions under original and geometrically/semantically perturbed 3D scene graphs.
Text-to-3D generation gets a semantic upgrade: DreamPartGen creates 3D objects whose parts not only look right but also reflect their relationships to one another and align with the textual description.
Achieve SOTA multimodal performance across eight benchmarks and strong zero-shot generalization without task-specific training by decoupling understanding and generation via unified discrete flow matching.