NLPR & MAIS, Institute of Automation, Chinese Academy of Sciences
A principled framework for General World Models reveals the limitations of current systems and the architectural requirements for future progress.
Overconfident errors in reinforcement learning with verifiable rewards (RLVR) monopolize probability mass and suppress exploration; a confidence-aware penalty counteracts this and improves mathematical reasoning performance.