Case Western Reserve University
LLM agents struggle to generalize from experience to reusable skills and often perform worse with abstracted skills than when simply replaying past trajectories, revealing a critical gap in current abstraction methods.
Agent evaluation is bottlenecked by environment interaction overhead; ACE-Bench cuts this overhead by using static JSON files, enabling fast, reproducible training-time validation.