A single model now rivals specialized vision-language models in understanding, while also generating and editing images, thanks to a unified discrete diffusion framework.
GPT-4o now has open-source competition: Ming-Omni matches its modality support in a single, unified model capable of perception and generation across image, text, audio, and video.