Search papers, labs, and topics across Lattice.
2
0
5
4
Existing document OCR models fail to preserve crucial structural and executable properties of LaTeX, but TexOCR, trained with verifiable rewards, finally delivers compilable page-to-LaTeX reconstruction.
Training on SciMDR, a new 300K QA dataset synthesized from scientific papers, substantially boosts model performance on complex, document-level scientific reasoning tasks.