Search papers, labs, and topics across Lattice.
1
0
3
Unleashing the full potential of multimodal LLMs requires reasoning directly in the visual latent space, and this paper shows how to do it with stable policy optimization.