Training a single model across text, images, video, 3D geometry, and hidden representations unlocks "Context Unrolling," where the model draws on multiple modalities to improve reasoning fidelity.
A new dataset and benchmark for visual question-visual answering (VQ-VA) bring open-source models closer to proprietary systems.