Search papers, labs, and topics across Lattice.
5
0
9
36
Text-to-video models can now learn geometrically consistent world dynamics via reinforcement learning, without expensive architectural changes.
Forget representational differences - the secret to better feed-forward 3D scene modeling lies in tackling five core design problems.
Counterintuitively, VLMs can achieve higher VQA accuracy by intentionally degrading visual inputs, suggesting that high-resolution details can act as noise that hinders reasoning.
LLMs can achieve 2.5x higher throughput and 10.7x KV memory reduction in long-context reasoning by compressing the KV cache using trigonometric functions derived from pre-RoPE query/key vector distributions.
The medical imaging AI community is being held back by a fragmented data landscape, but a new metadata-driven fusion paradigm offers a path to unlocking the power of foundation models.