Search papers, labs, and topics across Lattice.
Tencent
2
0
5
DFlare achieves up to 5.52x speedup in LLM inference by allowing draft layers to independently leverage richer target knowledge, breaking through previous capacity constraints.
GLM-5 doesn't just code; it engineers, showcasing unprecedented capability in tackling end-to-end software engineering challenges.