Manuscript received ; revised . (Corresponding author: Wenlong Niu.) This work was supported by the Civil Aerospace Pre-research Project under Grant D040103. All authors are with the Key Laboratory of Electronics and Information Technology for Space Systems, National Space Science Center, Chinese Academy of Sciences, Beijing 100190, China. Weihua Gao is also with the School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China. (e-mail: gaoweihua22@mails.ucas.ac.cn; niuwenlong@nssc.ac.cn)
Tsinghua AI1
A compact 0.9B-parameter multimodal model, GLM-OCR, achieves state-of-the-art document understanding by predicting multiple tokens per decoding step, boosting decoding throughput without a large increase in memory use.