Search papers, labs, and topics across Lattice.
5
0
8
HiLoRA achieves superior personalization in federated learning by adaptively clustering clients based on LoRA subspace similarity, enabling targeted knowledge sharing and outperforming standard LoRA-based methods.
ColoDiff generates realistic colonoscopy videos with precise control over clinical attributes, offering a promising solution to data scarcity in intestinal disease diagnosis.
A new model, TAC, uses synthetic training data to achieve state-of-the-art audio and audio-visual reasoning by generating temporally grounded captions that can then be fed into LLMs.
A purely Transformer-based audio tokenizer, pre-trained on 3M hours of data, leapfrogs existing codecs and even enables a fully autoregressive TTS model to outperform cascaded systems.
Open-source MOVA lets you generate synchronized, high-quality video and audio—including realistic lip sync—without relying on closed-source systems.