Achieve state-of-the-art 4-bit LLM quantization accuracy with SERQ, a saliency-aware error reconstruction method that uses a single low-rank matrix, outperforming existing methods while reducing calibration complexity.
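To make the idea of low-rank error reconstruction concrete, here is a minimal sketch of the generic technique: quantize a weight matrix to 4 bits, then approximate the quantization error with a truncated SVD and add it back. This is not SERQ's algorithm. The saliency weighting and calibration procedure are not modeled, and the function names, the symmetric per-row quantizer, the matrix shape, and the rank of 8 are all assumptions chosen for illustration.

```python
import numpy as np

def quantize_4bit(w):
    """Symmetric per-row round-to-nearest quantization (hypothetical helper).

    Maps each row onto integer levels in [-8, 7] and returns the
    dequantized (floating-point) weights.
    """
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0
    scale = np.where(scale == 0, 1.0, scale)  # avoid division by zero on all-zero rows
    q = np.clip(np.round(w / scale), -8, 7)
    return q * scale

def low_rank_correction(w, w_q, rank):
    """Best rank-`rank` approximation of the error w - w_q (truncated SVD)."""
    err = w - w_q
    u, s, vt = np.linalg.svd(err, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]

# Illustrative data only: a random matrix standing in for a layer's weights.
rng = np.random.default_rng(0)
w = rng.standard_normal((64, 128))

w_q = quantize_4bit(w)
corr = low_rank_correction(w, w_q, rank=8)

err_before = np.linalg.norm(w - w_q)           # Frobenius norm of the raw quantization error
err_after = np.linalg.norm(w - (w_q + corr))   # error after adding the low-rank term

print(f"error before correction: {err_before:.4f}")
print(f"error after correction:  {err_after:.4f}")
```

In this unweighted form the correction minimizes the Frobenius norm of the remaining weight error for a given rank. A saliency-aware method would presumably weight the error by some measure of importance before factorizing, which this sketch does not attempt.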