Zhejiang University, Westlake University
A single model now rivals specialized vision-language models in understanding, while also generating and editing images, thanks to a unified discrete diffusion framework.
A 5B model just beat models 5-16x larger at image generation and editing, thanks to smarter feature fusion and a novel RL training strategy.