University of Chinese Academy of Sciences, Chinese Academy of Sciences
Latent reasoning, previously unstable under reinforcement learning (RL), can now outperform explicit reasoning while using 3-4x shorter reasoning chains, thanks to a new method that stabilizes latent-space exploration.