LLMs have "pure incorrectness" features that correlate with wrong answers but don't actually *cause* them, suggesting that simply identifying error-correlated activations isn't enough for effective intervention.
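The correlation-versus-causation distinction can be made concrete with a toy probe-and-ablate experiment. This is not the paper's method; the synthetic activations, dimensions, and probe below are all invented for illustration. The point: a direction can score highly on an "incorrectness" probe, yet zeroing it leaves the error rate untouched.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy activations: dim 0 actually causes errors, dim 1 merely co-varies
# with it, and the remaining dims are noise.
n, d = 2000, 8
causal = rng.normal(size=n)
correlated = causal + 0.3 * rng.normal(size=n)  # tracks errors, not causal
H = np.column_stack([causal, correlated, rng.normal(size=(n, d - 2))])
wrong = (causal < 0).astype(float)              # errors caused by dim 0 only

# A least-squares probe finds an "incorrectness" direction that puts
# substantial weight on the merely-correlated dim 1.
w = np.linalg.lstsq(H, wrong - wrong.mean(), rcond=None)[0]

# Intervention test: ablate dim 1 and recompute the outcome. If the
# error rate is unchanged, the feature was predictive but not causal.
H_ablated = H.copy()
H_ablated[:, 1] = 0.0
wrong_after = (H_ablated[:, 0] < 0).astype(float)  # outcome still set by dim 0

print("probe weight on correlated dim:", round(float(w[1]), 3))
print("error rate before ablation:", wrong.mean())
print("error rate after ablation: ", wrong_after.mean())
```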
LLM agents can learn to cooperate far more efficiently by borrowing credit assignment techniques from classic multi-agent RL.
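One classic credit-assignment technique this could refer to is difference rewards: each agent learns from its marginal contribution to the team reward rather than from the raw shared reward. A minimal sketch with independent Q-learners on an invented coverage game (the game, the default action, and the hyperparameters are all assumptions, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(0)
N_AGENTS, N_ACTIONS = 4, 3

def global_reward(actions):
    # Toy team objective (illustrative): reward distinct choices,
    # i.e. a simple coverage game.
    return len(set(actions))

Q = np.zeros((N_AGENTS, N_ACTIONS))  # one row of Q-values per agent
eps, lr = 0.1, 0.2

for step in range(5000):
    # Epsilon-greedy joint action.
    acts = [int(rng.integers(N_ACTIONS)) if rng.random() < eps
            else int(np.argmax(Q[i])) for i in range(N_AGENTS)]
    g = global_reward(acts)
    for i in range(N_AGENTS):
        # Difference reward: agent i's marginal contribution, computed
        # by swapping its action for a fixed default (action 0).
        counterfactual = list(acts)
        counterfactual[i] = 0
        d_i = g - global_reward(counterfactual)
        Q[i, acts[i]] += lr * (d_i - Q[i, acts[i]])

greedy = [int(np.argmax(Q[i])) for i in range(N_AGENTS)]
print("greedy joint action:", greedy)
print("team reward:", global_reward(greedy))
```

The design point is that `d_i` filters out the other agents' contributions, so each learner gets a far less noisy signal than the shared `g`.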
LLMs learn faster and perform better in decision-making tasks when given a reward bonus for uncertainty, not just a reward for success.
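A minimal sketch of the idea, with a count-based bonus standing in for the agent's uncertainty (the bandit, `BETA`, and the bonus form are illustrative assumptions, not the paper's formulation):

```python
import numpy as np

rng = np.random.default_rng(0)
TRUE_MEANS = np.array([0.2, 0.5, 0.8])  # toy 3-armed bandit (illustrative)
BETA = 0.5                              # weight on the uncertainty bonus

counts = np.zeros(3)
values = np.zeros(3)

for t in range(2000):
    # Augmented objective: estimated value plus an uncertainty bonus.
    # The count-based term 1/sqrt(n) is a cheap stand-in for epistemic
    # uncertainty; it shrinks as an arm is visited more often.
    bonus = BETA / np.sqrt(counts + 1)
    arm = int(np.argmax(values + bonus))
    r = rng.normal(TRUE_MEANS[arm], 0.1)
    counts[arm] += 1
    values[arm] += (r - values[arm]) / counts[arm]  # running-mean update

print("pulls per arm:", counts.astype(int))
print("estimated values:", values.round(2))
```

Without the bonus, a greedy learner can lock onto the first arm that pays out; the uncertainty term keeps it sampling under-explored arms until its estimates are trustworthy.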
Rather than weighting preferences alone, this new method uses conformal prediction to directly quantify and leverage the reliability of the *answers* themselves, leading to more robust and data-efficient LLM alignment.
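A sketch of how conformal prediction can turn answer scores into reliability weights for preference learning. Everything here is assumed for illustration, not taken from the paper: the calibration scores, the scoring convention (lower nonconformity = better answer), and the reliability-weighted logistic loss.

```python
import numpy as np

rng = np.random.default_rng(0)

# Calibration nonconformity scores from held-out answers known to be
# good (an invented stand-in for, e.g., a reward model's scores).
cal_scores = rng.normal(0.0, 1.0, size=500)

def conformal_pvalue(score, cal):
    """Fraction of calibration scores at least as extreme: a
    distribution-free reliability measure for a new answer."""
    return (1 + np.sum(cal >= score)) / (len(cal) + 1)

# Hypothetical preference pairs as (score_chosen, score_rejected);
# lower nonconformity means a more reliable answer.
pairs = [(-0.5, 1.2), (0.1, 0.3), (2.5, 2.6)]

for s_chosen, s_rejected in pairs:
    # Down-weight pairs whose preferred answer looks unreliable.
    w = conformal_pvalue(s_chosen, cal_scores)      # high p-value = conforming
    margin = s_rejected - s_chosen                  # >0 when chosen is better
    loss = -w * np.log(1.0 / (1.0 + np.exp(-margin)))  # weighted logistic loss
    print(f"weight={w:.2f}  pairwise loss={loss:.3f}")
```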