Mondo Robotics, B trainable parameters, comparable to Qwen
By jointly modeling video dynamics and actions, DiT4DiT achieves 10x higher sample efficiency and 7x faster convergence in robot policy learning, showing that video generation can serve as a powerful scaling proxy.