UMMs exhibit significant exposure bias in multi-turn interactions, revealing critical performance gaps that existing benchmarks overlook.
Unlock multimodal interleaved generation in existing vision-language models, without large interleaved datasets, using a novel reinforcement learning approach with hybrid rewards.
Achieve spatially precise image edits in complex scenes by explicitly reasoning about object positions in text *before* visual grounding.