Search papers, labs, and topics across Lattice.
2
0
4
Video generation models don't reason frame-by-frame as previously thought; instead, they explore multiple solutions during diffusion denoising and progressively converge, revealing a "Chain-of-Steps" mechanism.
A 1000x larger video reasoning dataset reveals early signs of emergent generalization, offering a new foundation for training and evaluating spatiotemporal AI.