The medical imaging AI community is being held back by a fragmented data landscape, but a new metadata-driven fusion paradigm offers a path to unlocking the power of foundation models.
LLMs can now more accurately answer questions on complex documents thanks to a new system that understands layout and hierarchical relationships between document components.
A principled framework for General World Models reveals the limitations of current systems and the architectural requirements for future progress.
A new benchmark of hour-long narrative videos reveals that current LLMs and VLMs struggle with multi-step reasoning over long videos, often failing to maintain temporal coherence and procedural validity.
Open-source multimodal models just leveled up: by pre-training vision and language together from the start, InternVL3 rivals closed-source titans like GPT-4o.