MXFP4 quantization just got a whole lot better: BATQuant recovers up to 96.43% of full-precision performance on LLMs and multimodal LLMs (MLLMs), even under the aggressive W4A4KV16 setting (4-bit weights, 4-bit activations, 16-bit KV cache), by preventing outliers from propagating across quantization blocks.
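
To see why outliers matter for block formats at all, here is a minimal sketch of MXFP4 fake-quantization, assuming the OCP Microscaling layout (32-element blocks, one shared power-of-two scale per block, FP4 E2M1 elements). It is not BATQuant's method, only an illustration of how a single outlier inflates the shared scale of its block. The function names are hypothetical, and ties are broken toward the smaller magnitude as a simplification.

```python
import math

# Magnitudes representable by the FP4 E2M1 element format.
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK_SIZE = 32  # elements that share one scale in MXFP4
EMAX_ELEM = 2    # exponent of the largest E2M1 value (6 = 1.5 * 2**2)


def quantize_block(block):
    """Quantize one block to MXFP4 and dequantize it back to floats."""
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return [0.0] * len(block)
    # Shared power-of-two scale, chosen from the largest magnitude in the block.
    scale = 2.0 ** (math.floor(math.log2(max_abs)) - EMAX_ELEM)
    out = []
    for x in block:
        # Saturate at the largest FP4 magnitude, then snap to the nearest grid point.
        mag = min(abs(x) / scale, FP4_GRID[-1])
        nearest = min(FP4_GRID, key=lambda g: abs(g - mag))
        out.append(math.copysign(nearest * scale, x))
    return out


def mean_abs_error(original, quantized):
    """Average absolute difference between two equally long sequences."""
    return sum(abs(a - b) for a, b in zip(original, quantized)) / len(original)


# A well-behaved block, and the same block with one large outlier in slot 0.
clean = [(i - 16) / 32 for i in range(BLOCK_SIZE)]
with_outlier = [100.0] + clean[1:]

for name, block in [("clean", clean), ("with outlier", with_outlier)]:
    q = quantize_block(block)
    # Compare error on the 31 ordinary elements only.
    print(name, mean_abs_error(block[1:], q[1:]))
```

In this toy block the outlier 100.0 pushes the shared scale to 16, so every other element rounds to zero, while the clean block keeps a scale of 0.125 and loses far less. Keeping outliers from dominating blocks like this is the kind of problem the post describes.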