This paper develops Mixture-of-Experts (MoE) and Mixture-of-Linear-Experts (MoLE) architectures to enhance the expressive capacity of Machine Learning Interatomic Potentials (MLIPs). The study systematically analyzes the impact of routing strategies and expert designs, showing that sparse activation combined with shared experts substantially improves performance. The element-wise MoE model achieves state-of-the-art accuracy on the OMol25, OMat24, and OC20M benchmarks, and its routing patterns reveal chemically interpretable expert specialization.
MLIPs get a boost: Mixture-of-Experts architectures unlock state-of-the-art accuracy and chemically interpretable insights in interatomic modeling.
Machine Learning Interatomic Potentials (MLIPs) enable accurate large-scale atomistic simulations, yet improving their expressive capacity efficiently remains challenging. Here we systematically develop Mixture-of-Experts (MoE) and Mixture-of-Linear-Experts (MoLE) architectures for MLIPs and analyze the effects of routing strategies and expert designs. We show that sparse activation combined with shared experts yields substantial performance gains, and that nonlinear MoE formulations outperform MoLE when shared experts are present, underscoring the importance of nonlinear expert specialization. Furthermore, element-wise routing consistently surpasses configuration-level routing, while global MoE routing often leads to numerical instability. The resulting element-wise MoE model achieves state-of-the-art accuracy across the OMol25, OMat24, and OC20M benchmarks. Analysis of routing patterns reveals chemically interpretable expert specialization aligned with periodic-table trends, indicating that the model effectively captures element-specific chemical characteristics for precise interatomic modeling.
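To make the ideas of element-wise routing, sparse activation, and a shared expert concrete, the following is a minimal NumPy sketch under stated assumptions. It is not the paper's implementation: the function names (`init_moe`, `elementwise_moe`), the lookup-table router indexed by atomic number, the top-k softmax gating, the two-layer SiLU experts, and all sizes are hypothetical choices made only for illustration.

```python
import numpy as np


def silu(x):
    """SiLU nonlinearity, used here so the experts are nonlinear."""
    return x / (1.0 + np.exp(-x))


def init_moe(rng, n_elements, n_experts, dim, hidden):
    """Randomly initialize a toy MoE layer (all shapes are illustrative assumptions)."""
    return {
        # One row of router logits per chemical element (element-wise routing).
        "router": rng.normal(size=(n_elements, n_experts)),
        # Routed experts: one two-layer MLP per expert.
        "w1": rng.normal(size=(n_experts, dim, hidden)) / np.sqrt(dim),
        "w2": rng.normal(size=(n_experts, hidden, dim)) / np.sqrt(hidden),
        # Shared expert: applied to every atom regardless of routing.
        "shared_w1": rng.normal(size=(dim, hidden)) / np.sqrt(dim),
        "shared_w2": rng.normal(size=(hidden, dim)) / np.sqrt(hidden),
    }


def elementwise_moe(params, features, atomic_numbers, top_k=2):
    """Route each atom to top_k experts based only on its element, plus a shared expert.

    features:       (n_atoms, dim) per-atom feature vectors
    atomic_numbers: (n_atoms,) integer indices into the router table
    """
    # Router logits depend only on the element, not on the atom's features.
    logits = params["router"][atomic_numbers]              # (n_atoms, n_experts)

    # Sparse activation: keep the top_k experts per atom.
    top_idx = np.argsort(-logits, axis=1)[:, :top_k]       # (n_atoms, top_k)
    top_logits = np.take_along_axis(logits, top_idx, axis=1)

    # Softmax over the selected experts only (shifted for numerical stability).
    top_logits = top_logits - top_logits.max(axis=1, keepdims=True)
    gates = np.exp(top_logits)
    gates = gates / gates.sum(axis=1, keepdims=True)

    # Shared expert contribution, always active.
    out = silu(features @ params["shared_w1"]) @ params["shared_w2"]

    # Add the gated contribution of each selected routed expert.
    for slot in range(top_k):
        expert = top_idx[:, slot]                          # (n_atoms,)
        h = silu(np.einsum("nd,ndh->nh", features, params["w1"][expert]))
        y = np.einsum("nh,nhd->nd", h, params["w2"][expert])
        out = out + gates[:, slot:slot + 1] * y

    return out, top_idx, gates
```

In this sketch, two atoms of the same element always select the same experts with the same gate weights, which is the property that makes per-element routing patterns easy to inspect. A configuration-level router would instead compute a single set of gates per structure; that variant is not shown here.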