Training a single model across text, images, video, 3D geometry, and hidden representations unlocks "Context Unrolling," where the model works across modalities to improve reasoning fidelity.
Adversarial training can drastically improve the sample quality of existing flow-matching models, achieving FID improvements of over 4.5 points on ImageNet 256px.
Open-source VQ-VA models just got a massive boost: a new dataset and benchmark close the gap with proprietary systems on visual question-visual answering.