Controllable 3D generation takes a leap forward with 3D-ReGen, a framework that leverages an initial 3D shape for tasks like enhancement and editing, outperforming existing methods.
Human motion generation gets a dose of reality: IAM shows that explicitly modeling body morphology and identity leads to more realistic and consistent movements.
Current egocentric video benchmarks miss the mark: EgoEverything uses human gaze to create questions that actually reflect how people behave, not just what they see.
A million videos with paired depth, camera pose, and 3D point tracks could unlock a new wave of 3D-aware video models.
Training 3D avatar diffusion models on millions of in-the-wild videos is now possible, thanks to a clever 3D tokenization and visibility-aware training strategy that overcomes partial observability.
Forget scaling laws: a large VLM strategically paired with a smaller model's reasoning tokens can rival the performance of a much larger, monolithic model.
Music-grounded video editing can now produce significantly more coherent timelines thanks to a novel global-local coordination mechanism that resolves cross-segment conflicts.
Unlocking the secrets of viral video ads: a new MLLM framework reveals which initial moments hook viewers and drive conversions.
By surgically intervening in MLLM decoding, this work cuts hallucination rates without sacrificing descriptive quality, a feat prior methods struggled to achieve.
Edit the bassline, drums, or other instruments of any song with this new open-source multi-stem music generation model.