Controllable 3D generation takes a leap forward with 3D-ReGen, a framework that leverages an initial 3D shape for tasks like enhancement and editing, outperforming existing methods.
Human motion generation gets a dose of reality: IAM shows that explicitly modeling body morphology and identity leads to more realistic and consistent movements.
Current egocentric video benchmarks miss the mark: EgoEverything uses human gaze to create questions that actually reflect how people behave, not just what they see.
A million videos with paired depth, camera pose, and 3D point tracks could unlock a new wave of 3D-aware video models.
Scaling robot learning with human data isn't a simple "more is better" equation; alignment with robot learning objectives is key.
Training 3D avatar diffusion models on millions of in-the-wild videos is now possible, thanks to a clever 3D tokenization and visibility-aware training strategy that overcomes partial observability.
Music-grounded video editing can now produce significantly more coherent timelines thanks to a novel global-local coordination mechanism that resolves cross-segment conflicts.
Stop avatars from looking like they're having a seizure: this method uses autoregressive prediction of appearance latents to create temporally stable and high-fidelity 3D Gaussian avatars.
Unlocking the secrets of viral video ads: a new MLLM framework reveals which initial moments hook viewers and drive conversions.
By surgically intervening in MLLM decoding, this work cuts hallucination rates without sacrificing descriptive quality, a feat prior methods struggled to achieve.