Xiaohongshu Inc.
Flow-matching transformers with latent multi-modal conditioning and self-reference can leapfrog existing virtual try-on methods in both visual fidelity and inference speed.
VLMs can be transformed into pixel-precise structural document parsing experts, achieving state-of-the-art OCR performance by enforcing syntactic validity and structural integrity through reinforcement learning.
Instruction-based image editing just got a whole lot better: FireRed-Image-Edit leapfrogs existing systems with a massive, meticulously curated dataset and a suite of training innovations.