Virginia Tech
PRIME reveals a crucial precursor to reward hacking, making it possible to predict and adapt to misalignment before it manifests, and offers a new lens on alignment risks in RL systems.
LLM agents can be significantly improved by *removing* redundant and outdated skills from their skill banks, not just adding more.