The University of Hong Kong
Patents that oversell their innovation face a *penalty* in evaluation: they are less likely to be granted, transferred, or successfully appealed.
Multi-turn RL agents can learn far more effectively by explicitly monitoring and controlling uncertainty at both the token and turn levels, leading to more stable training and higher performance.