LLMs can be aggressively quantized to W(1+1)A4 without significant performance degradation using a surprisingly simple three-stage distillation approach.