Current research-agent benchmarks miss critical flaws: MiroEval shows that process quality reliably predicts research outcomes, and that multimodal tasks expose weaknesses invisible to output-level metrics.
By verifying its reasoning steps both locally and globally, MiroThinker-H1 achieves state-of-the-art performance on complex research tasks, demonstrating the power of integrated verification for reliable multi-step problem solving.
LLMs can now directly model the generative reasoning process for scientific discovery, thanks to a complexity-breaking framework that reduces an exponential search space to logarithmic.
RLHF struggles with long contexts because the reward signal for *finding* the right information vanishes, but can be revived by directly rewarding the model for selecting relevant context.
MiroFlow leapfrogs existing LLM agent frameworks with its agent graph architecture, delivering state-of-the-art performance and robust execution across a diverse range of benchmarks.