Search papers, labs, and topics across Lattice.
Osaka University
2
0
6
Achieve 50% parameter reduction in LLaMA-2-7B with minimal performance loss and no fine-tuning, thanks to a new global gating-based structured pruning method.
Optimizing multilingual training? Shapley values reveal the hidden cross-lingual transfer effects that current scaling laws miss, leading to better language mixture ratios.