Shenzhen University
A new global gating-based structured pruning method achieves a 50% parameter reduction in LLaMA-2-7B with minimal performance loss and no fine-tuning.
Exploiting structural information when merging proximity graph indexes for approximate k-nearest neighbor (AKNN) search yields up to a 5.48x speedup, outperforming naive reconstruction by nearly 10x.