Today's best text-to-audio-video models may look and sound impressive, but they still struggle with basic physics, coherent speech, and even rendering text correctly.
VLMs struggle to simultaneously optimize for both logical accuracy and aesthetics when generating academic illustrations, a challenge that test-time scaling can significantly alleviate.
Achieve few-step trajectory-controllable video generation without sacrificing video quality or trajectory accuracy, outperforming prior distillation methods.