LLMs reason better when their uncertainty consistently decreases, paving the way for shorter, more accurate chain-of-thought reasoning.
A compact 0.9B multimodal model, GLM-OCR, achieves state-of-the-art document understanding by predicting multiple tokens at once, boosting decoding throughput without blowing up memory.