Figure 6: Fine-grained performance comparison of evaluated models in the Reference-to-Video setting.
AI-driven scientific discovery is closer than you think, but current systems still struggle with reproducibility, cross-domain robustness, and accountable scientific closure.
Current video generation benchmarks miss the forest for the trees: EvalVerse actually measures cinematic quality, not just prompt adherence.
MLLMs can learn to reason more faithfully by explicitly anchoring visual attention to relevant image regions and reinforcing the use of that evidence during reasoning via counterfactual interventions.