VLMs can be transformed into pixel-precise structural document parsing experts, achieving state-of-the-art OCR performance by enforcing syntactic validity and structural integrity through reinforcement learning.
Instruction-based image editing just got a whole lot better: FireRed-Image-Edit leapfrogs existing systems with a massive, meticulously curated dataset and a suite of training innovations.