1 Tencent PCG, 2 Tencent CSIG
Unleashing the full potential of multimodal LLMs requires reasoning directly in the visual latent space, and this paper shows how to do it with stable policy optimization.