Search papers, labs, and topics across Lattice.
3
0
6
Achieving long-range consistency in video generation is now feasible with a hierarchical token approach that balances detail and computational efficiency.
Forget finetuning video models for each robot: a single, action-free video world model can drive diverse robots when paired with a carefully designed inverse dynamics model.
Encoder-decoder architectures can beat decoder-only transformers in novel view synthesis, overturning conventional wisdom with a compute-optimal design (SVSM) that slashes training costs.