Ditch slow, error-prone autoregressive video captioning: diffusion models can now generate captions in parallel, rivaling autoregressive quality with a significant speed boost.