Converting noisy, human-centric guides into self-evolving agent skills can yield performance improvements of up to 25.3 percentage points across diverse tasks.
Span-level error localization can boost deep-research agent reliability by up to 30 percentage points, revealing critical insights into where agents go wrong.
TVIR-Agent reveals that integrating visual elements into report generation can dramatically improve the quality and reliability of analytical outputs.
LLM agent distillation leads to surprisingly high rates of behavioral mimicry, with some student models exhibiting tool-use habits *more* similar to their teacher's than those of models in the teacher's own family.
Evaluating web coding LLMs with real-world fidelity reveals that even state-of-the-art models still struggle with aesthetics and framework-specific nuances.