RL-trained LLM agents can get stuck in an "information self-locking" trap, in which they fail both to ask the right questions and to internalize information, but a simple reallocation of the learning signal can break them out.