Search papers, labs, and topics across Lattice.
University of California, Santa Barbara 2 Apple, USA 3 University of Washington 4 Independent Researcher, USA
Apple ML Research4
0
8
1
Realistic user simulation is now possible: Pare offers a framework that moves beyond flat tool-calling APIs to model stateful user interactions, enabling better evaluation of proactive agents.
Injecting demonstrations with a carefully annealed probability can drastically improve exploration in RLVR, even for tasks requiring novel reasoning or domain-specific knowledge.
Despite advances in multimodal models, they still struggle to understand spatial relationships from an egocentric perspective, as shown by a 37.66% performance gap on the new SAW-Bench benchmark.
Forget hand-crafted reward functions: CM2 uses checklists to train tool-using agents, outperforming SFT baselines by up to 12 points on key benchmarks.