Shenzhen University
Achieve 50% parameter reduction in LLaMA-2-7B with minimal performance loss and no fine-tuning, thanks to a new global gating-based structured pruning method.
LLMs can develop more consistent world models by predicting multiple tokens *and* anchoring those predictions to ground-truth hidden state trajectories, mitigating structural hallucinations.