3
0
6
27
LLMs exhibit a "Utopian bias" when simulating human behavior, converging towards an unrealistic "positive average person" and failing to capture individual differences and long-tail behaviors.
LLMs trained with reinforcement learning from verifiable rewards (RLVR) become overconfident in incorrect answers, but a simple fix, decoupling reasoning and calibration objectives, can restore proper calibration without sacrificing accuracy.
By grounding reflection in the visual artifacts of presentation slides, DeepPresenter enables agents to iteratively refine presentations in a way that internal reasoning traces alone cannot.