By "imagining" new scenarios and asking "What if this were the true preference?", CRED actively designs environments and trajectories to expose differences between competing reward functions, dramatically improving preference learning.
A 3B model can match the performance of models more than twice its size in mobile GUI automation by distilling visual history into concise natural language summaries.
Forget opaque LLM-driven memory policies: A-MAC offers transparent, efficient control over LLM agent memory by factoring in utility, confidence, novelty, recency, and content type.
Unleashing the full reasoning potential of VLMs, AgentM3D adaptively scales test-time reasoning paths to achieve state-of-the-art zero-shot multi-modal misinformation detection.