The University of Hong Kong
LLMs can achieve state-of-the-art results on complex reasoning tasks with far fewer parameters by iteratively excavating and reasoning over external knowledge.
By pruning and quantizing the KV cache, XStreamVGGT achieves a remarkable 4.42x memory reduction and 5.48x speedup in streaming 3D reconstruction without sacrificing performance.
Train your LLMs and ViTs faster: a nonparametric teaching method cuts training time by up to 21% *without* sacrificing accuracy.
Training LLMs for efficient reasoning is best achieved by using easier prompts to ensure a dense positive reward signal, preventing undesirable length collapse.