Even state-of-the-art text-to-image models such as Qwen-Image can render text with significantly better structural fidelity and semantic alignment when trained with a novel RL strategy whose reward quantifies structural anomalies.
The open-source release of SAIL-VL2 gives the multimodal community a new state-of-the-art vision-language model under 4B parameters, driven by innovations in data curation, progressive training, and sparse MoE architectures.