This paper introduces Geometric-Aware Quantization (GAQ), a quantization framework designed to preserve SO(3) equivariance in low-bit GNNs for molecular simulations. GAQ uses Magnitude-Direction Decoupled Quantization (MDDQ) to quantize invariant lengths and equivariant orientations separately, together with a symmetry-aware training strategy and robust attention normalization. On rMD17, W4A8 GAQ models match FP32 accuracy while reducing Local Equivariance Error by over 30x relative to naive quantization, with a 2.39x inference speedup and 4x memory reduction on consumer hardware.
Quantizing equivariant GNNs no longer has to break symmetry: W4A8 GAQ models match FP32 accuracy with a 2.39x inference speedup and 4x memory reduction, while cutting equivariance error by over 30x relative to naive quantization.
Equivariant Graph Neural Networks (GNNs) are essential for physically consistent molecular simulations but suffer from high computational costs and memory bottlenecks, especially with high-order representations. Low-bit quantization offers a solution, but applying it naively to rotation-sensitive features destroys the SO(3)-equivariant structure, leading to significant errors and violations of conservation laws. To address this, we propose a Geometric-Aware Quantization (GAQ) framework that compresses and accelerates equivariant models while rigorously preserving continuous symmetry in discrete spaces. Our approach makes three contributions: (1) a Magnitude-Direction Decoupled Quantization (MDDQ) scheme that separates invariant lengths from equivariant orientations to maintain geometric fidelity; (2) a symmetry-aware training strategy that treats scalar and vector features with distinct quantization schedules; and (3) a robust attention normalization mechanism to stabilize gradients in low-bit regimes. Experiments on the rMD17 benchmark demonstrate that our W4A8 models match the accuracy of FP32 baselines (9.31 meV vs. 23.20 meV) while reducing Local Equivariance Error (LEE) by over 30x compared to naive quantization. On consumer hardware, GAQ achieves a 2.39x inference speedup and a 4x memory reduction, enabling stable, energy-conserving molecular dynamics simulations over nanosecond timescales.
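The idea of separating invariant lengths from equivariant orientations can be illustrated with a small NumPy sketch. This is not the paper's implementation: the Fibonacci-sphere direction codebook, the bit widths, the error metric, and every function name (`mddq`, `naive_quant`, `equivariance_error`, and so on) are assumptions chosen for illustration. The sketch quantizes each vector's norm with a scalar quantizer, so the stored magnitude is unaffected by rotation, snaps its direction to the nearest unit vector in a fixed codebook, and compares the result against per-component quantization using a simple proxy for equivariance error, the mean of ||Q(Rv) - R Q(v)||.

```python
import numpy as np

def fibonacci_sphere(n):
    """Return n roughly uniform unit vectors (an assumed direction codebook)."""
    i = np.arange(n) + 0.5
    z = 1.0 - 2.0 * i / n
    phi = np.pi * (1.0 + 5.0 ** 0.5) * i
    r = np.sqrt(1.0 - z * z)
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=1)

def quantize_uniform(x, bits, max_val):
    """Unsigned uniform quantizer on [0, max_val]; returns dequantized values."""
    levels = 2 ** bits - 1
    step = max_val / levels
    return np.clip(np.round(x / step), 0, levels) * step

def mddq(v, codebook, mag_bits=8, max_norm=1.0):
    """Quantize magnitude (rotation-invariant) and direction separately."""
    norms = np.linalg.norm(v, axis=-1)
    q_norms = quantize_uniform(norms, mag_bits, max_norm)
    dirs = v / np.maximum(norms, 1e-12)[..., None]
    # Nearest codebook direction = largest cosine similarity.
    idx = np.argmax(dirs @ codebook.T, axis=-1)
    return q_norms[..., None] * codebook[idx]

def naive_quant(v, bits=4, max_abs=1.0):
    """Per-component symmetric quantizer, which ignores vector geometry."""
    levels = 2 ** (bits - 1) - 1
    step = max_abs / levels
    return np.clip(np.round(v / step), -levels, levels) * step

def random_rotation(rng):
    """Sample a proper rotation matrix via QR of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))   # remove the sign ambiguity of QR
    if np.linalg.det(q) < 0:      # force det = +1 (rotation, not reflection)
        q[:, 0] = -q[:, 0]
    return q

def equivariance_error(quant, v, rot):
    """Mean ||Q(R v) - R Q(v)|| over a batch of row vectors."""
    return np.linalg.norm(quant(v @ rot.T) - quant(v) @ rot.T, axis=-1).mean()

rng = np.random.default_rng(0)
vectors = rng.uniform(-0.5, 0.5, size=(1000, 3))   # norms stay below 1
rotation = random_rotation(rng)
codebook = fibonacci_sphere(256)                   # 8-bit direction index

print("naive:", equivariance_error(naive_quant, vectors, rotation))
print("mddq: ", equivariance_error(lambda a: mddq(a, codebook), vectors, rotation))
```

With a fixed codebook the direction quantizer is only approximately equivariant, with the error governed by codebook resolution. Only the magnitude channel is rotation-invariant by construction in this sketch.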