Finally, you can puppeteer photorealistic 3D head avatars with independent control over identity, articulation, and nuanced emotional expression.
MLLMs can now see time: Delta-LLaVA unlocks precise change detection in remote sensing by explicitly modeling temporal relationships, outperforming existing models on a challenging new benchmark.
Real-time, lightweight image compression is now possible with diffusion models, thanks to a novel architecture that swaps transformers for convolutions and prioritizes compression-focused pre-training.
Stop MLLM hallucinations in VideoQA: ClueNet's two-stage training and adaptive clue filtering boost accuracy by 1.1% while improving interpretability and efficiency.
Ditching the 2D latent grid unlocks 60%+ bitrate reductions in generative video compression by encoding videos into adaptable 1D latent tokens.
Agentic RL can now beat proprietary LLMs and torch.compile in the challenging domain of CUDA kernel generation, achieving up to 40% speedups on hard tasks.