Forget black-box policies: CSRO uses LLMs to generate human-readable code policies in multi-agent RL, achieving performance competitive with traditional methods.
LLMs can autonomously discover novel MARL algorithms that outperform hand-designed baselines, revealing untapped potential in automated algorithm design.