VLMs are surprisingly brittle: swapping text for a semantically identical image tanks performance, but a new data curation method, LoMo, can fix it.
A 5B model just beat models 5-16x larger at image generation and editing, thanks to smarter feature fusion and a novel RL training strategy.
Ditch the clunky pipelines: SongGen generates complete songs from text in a single pass, offering unprecedented control over musical elements and voice cloning.