LLM agents struggle to maintain performance in multi-day collaborative tasks: performance drops significantly after just one environmental update, revealing a critical gap in adapting to evolving real-world conditions.
Ditch unimodal policies: flow-based policies combined with distributional RL unlock SOTA performance on MuJoCo by capturing complex, multimodal return distributions.