Search papers, labs, and topics across Lattice.
7
0
9
15
A dedicated guard agent, trained via reasoning-intensive methods, can effectively neutralize prompt injection attacks in web-navigating agents without sacrificing performance.
Forget hand-coded adaptation rules: Meta-TTL learns policies that let language agents self-improve at test time, generalizing zero-shot to unseen environments.
Current research agent benchmarks miss critical flaws, as MiroEval reveals that process quality is a reliable predictor of research outcome, and multimodal tasks expose weaknesses invisible to output-level metrics.
LLM agents can achieve 3x faster web search and higher accuracy by dynamically routing between multiple context management strategies.
LLMs struggle to maintain consistent personalization as conversations lengthen and preferences become less explicit, suggesting current models fall short of truly adaptive personal assistants.
A 106B model can beat a 1T model on long-horizon reasoning tasks, thanks to a novel training pipeline that distills knowledge from research papers and uses trajectory-splitting SFT and progressive RL.
Self-evolving LLM agents can be persistently compromised by injecting malicious payloads into their long-term memory, turning them into "zombie agents" that execute unauthorized actions across sessions.