Squeeze 34% more decode speed out of your MoE model without sacrificing accuracy by intelligently budgeting expert activations.