RL agents learn more efficiently when they incorporate group-level natural language feedback, achieving a 2.2x gain in sample efficiency in sparse-reward environments.
Unlock SOTA performance in long-horizon search tasks with REDSearcher, a framework that slashes the cost of training by strategically synthesizing complex tasks and boosting core LLM capabilities *before* reinforcement learning.