Forget context window limits: this RL method uses LLM-generated summaries to train agents for long-horizon tasks, achieving higher success rates with less context.