Today's visual generation models are often evaluated on the wrong things, leading to inflated performance claims that mask critical failures in spatial reasoning, temporal consistency, and causal understanding.
Current research agent benchmarks miss critical flaws: MiroEval shows that process quality reliably predicts research outcomes, and that multimodal tasks expose weaknesses invisible to output-level metrics.
By verifying its reasoning steps both locally and globally, MiroThinker-H1 achieves state-of-the-art performance in complex research tasks, demonstrating the power of integrated verification for reliable multi-step problem solving.
LLMs can now directly model the generative reasoning process for scientific discovery, thanks to a complexity-breaking framework that reduces the search cost from exponential to logarithmic.
RLHF struggles with long contexts because the reward signal for *finding* the right information vanishes, but that signal can be restored by directly rewarding the model for selecting relevant context.
MiroFlow leapfrogs existing LLM agent frameworks with its agent graph architecture, delivering state-of-the-art performance and robust execution across a diverse range of benchmarks.