PackFlow is a flow matching framework for molecular crystal structure prediction (CSP) that jointly samples heavy-atom Cartesian coordinates and unit-cell lattice parameters conditioned on a molecular graph. To steer generation toward physically plausible regions, a reinforcement learning post-training stage called physics alignment uses machine-learned interatomic potential energies and forces as stability proxies. On a set of unseen molecular systems and two CSP blind-test case studies, PackFlow generates candidates with greater structural similarity to experimental polymorphs than heuristic baselines and concentrates more probability mass in low-energy basins.
By aligning a flow matching model with machine-learned energy and force proxies via reinforcement learning, PackFlow generates crystal candidates that relax into lower-energy minima than those from heuristic baselines, offering a practical route to amortize the costly relax-and-rank bottleneck in molecular crystal structure prediction.
Organic molecular crystals underpin technologies ranging from pharmaceuticals to organic electronics, yet predicting solid-state packing of molecules remains challenging because candidate generation is combinatorial and stability is only resolved after costly energy evaluations. Here we introduce PackFlow, a flow matching framework for molecular crystal structure prediction (CSP) that generates heavy-atom crystal proposals by jointly sampling Cartesian coordinates and unit-cell lattice parameters given a molecular graph. This lattice-aware generation interfaces directly with downstream relaxation and lattice-energy ranking, positioning PackFlow as a scalable proposal engine within standard CSP pipelines. To explicitly steer generation toward physically favourable regions, we propose physics alignment, a reinforcement learning post-training stage that uses machine-learned interatomic potential energies and forces as stability proxies. Physics alignment improves physical validity without altering inference-time sampling. We validate PackFlow's performance against heuristic baselines through two distinct evaluations. First, on a broad unseen set of molecular systems, we demonstrate superior candidate generation capability, with proposals exhibiting greater structural similarity to experimental polymorphs. Second, we assess the full end-to-end workflow on two unseen CSP blind-test case studies, including relaxation and lattice-energy analysis. In both settings, PackFlow outperforms heuristics-based methods by concentrating probability mass in low-energy basins, yielding candidates that relax into lower-energy minima and offering a practical route to amortize the relax-and-rank bottleneck.
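To make the joint sampling of coordinates and lattice parameters concrete, the following is a minimal sketch of flow matching inference in general, not PackFlow's actual implementation. It integrates a velocity field from t = 0 (noise) to t = 1 (sample) with explicit Euler steps over a joint state of atomic coordinates and six lattice parameters. Everything named here is an assumption for illustration: `euler_sample`, `make_toy_velocity`, the six-parameter lattice vector (a, b, c, alpha, beta, gamma), and the closed-form toy velocity that stands in for a trained network conditioned on a molecular graph.

```python
import numpy as np


def euler_sample(velocity, coords0, lattice0, n_steps=50):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with explicit Euler.

    The state is the pair (coords, lattice); both parts are advanced
    together so that coordinates and cell parameters stay coupled.
    """
    coords, lattice = coords0.copy(), lattice0.copy()
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt  # t runs over 0, dt, ..., 1 - dt, so 1 - t never hits zero
        v_coords, v_lattice = velocity(coords, lattice, t)
        coords = coords + dt * v_coords
        lattice = lattice + dt * v_lattice
    return coords, lattice


def make_toy_velocity(target_coords, target_lattice):
    """Hypothetical stand-in for a learned velocity network.

    For the straight-line path x_t = (1 - t) * x0 + t * x1, the velocity
    that carries x_t to the endpoint x1 is (x1 - x_t) / (1 - t).
    A real model would predict this from the molecular graph instead.
    """
    def velocity(coords, lattice, t):
        scale = 1.0 / (1.0 - t)
        return scale * (target_coords - coords), scale * (target_lattice - lattice)
    return velocity


rng = np.random.default_rng(0)
n_atoms = 4  # assumed number of heavy atoms in the toy example

# Toy "data" endpoint: made-up coordinates and lattice parameters
# (a, b, c in angstroms; alpha, beta, gamma in degrees).
target_coords = rng.normal(size=(n_atoms, 3))
target_lattice = np.array([7.0, 8.0, 9.0, 90.0, 100.0, 90.0])

# Noise endpoint at t = 0.
coords0 = rng.normal(size=(n_atoms, 3))
lattice0 = rng.normal(size=6)

coords, lattice = euler_sample(
    make_toy_velocity(target_coords, target_lattice), coords0, lattice0
)
```

With this toy velocity the integration lands on the chosen endpoint, which only checks the sampler's mechanics. In a trained model the velocity field would be learned from data, and a post-training stage such as physics alignment would adjust the model's weights rather than this sampling loop, consistent with the statement that inference-time sampling is unchanged.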