Search papers, labs, and topics across Lattice.
4
0
6
6
LLM agents struggle to maintain performance in multi-day collaborative tasks, dropping significantly after just one environmental update, revealing a critical gap in adaptation to evolving real-world conditions.
Forget textual rules and coarse embeddings: a multimodal reward model that directly compares rendered visuals unlocks significant gains in vision-to-code RL.
Hallucinations in RL-based image editing and generation are tamed with FIRM, a new framework that trains robust reward models on curated datasets to provide more accurate guidance.
A 5B model just crushed the image generation and editing performance of models 5-16x larger, thanks to smarter feature fusion and a novel RL training strategy.