Subtracting the mean from activations unlocks stable FP4 training for LLMs, closing the performance gap with BF16 without complex spectral methods.