Achieve 40% better visual fidelity in multimodal face generation by deeply fusing text and spatial priors within a unified diffusion transformer.