LLM agents can explore more effectively by retrieving and reasoning over off-policy step-level traces, leading to significant performance gains and faster training.