LLMs reason better when their uncertainty consistently decreases, paving the way for shorter, more accurate chain-of-thought reasoning.
Mobile agents trained with RL struggle to generalize to new app interfaces, improving by only 8.3% over supervised baselines, despite a 26.1% gain on unseen instances of familiar tasks.