Large vision-language models (LVLMs) can now iteratively self-correct and reason over multimodal instructions, reaching state-of-the-art performance by dynamically fusing textual, visual, and contextual features.