A 4B-parameter model can now beat much larger models at social reasoning, thanks to a new RL framework that aligns model reasoning trajectories with human cognition.
LLM agents can now proactively protect user privacy with a new reinforcement learning approach that outperforms static defenses by 14% while maintaining helpfulness.