The paper introduces a novel crystal structure prediction (CSP) framework that leverages large language models to generate fine-grained Wyckoff patterns directly from chemical composition, moving beyond database retrieval methods. It enforces algebraic consistency between site multiplicities and atomic stoichiometry using a constrained-optimization search, ensuring physically valid geometric structures. Integrating this symmetry-consistent template into a diffusion backbone, the framework achieves state-of-the-art performance on stability, uniqueness, and novelty benchmarks, demonstrating superior matching performance and the ability to explore uncharted materials space.
Forget database lookups: this new framework uses LLMs to directly generate symmetry-consistent crystal structures from composition, opening up a new frontier in materials discovery.
Crystal structure prediction (CSP), which aims to predict the three-dimensional atomic arrangement of a crystal from its composition, is central to materials discovery and mechanistic understanding. Existing deep learning models often treat crystallographic symmetry only as a soft heuristic or rely on space group and Wyckoff templates retrieved from known structures, which limits both physical fidelity and the ability to discover genuinely new material structures. In contrast to retrieval-based methods, our approach leverages large language models to encode chemical semantics and directly generate fine-grained Wyckoff patterns from composition, circumventing the limitations inherent to database lookups. Crucially, we incorporate domain knowledge into the generative process through an efficient constrained-optimization search that rigorously enforces algebraic consistency between site multiplicities and atomic stoichiometry. By integrating this symmetry-consistent template into a diffusion backbone, our approach constrains the stochastic generative trajectory to a physically valid geometric manifold. This framework achieves state-of-the-art performance across stability, uniqueness, and novelty (SUN) benchmarks, alongside superior matching performance, thereby establishing a new paradigm for the rigorous exploration of targeted crystallographic space. It also enables efficient expansion into previously uncharted materials space, eliminating reliance on existing databases or a priori structural knowledge.
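The consistency condition between site multiplicities and stoichiometry can be illustrated with a small example: for each element, the multiplicities of the Wyckoff sites assigned to it must sum to that element's atom count in the unit cell, and a site with no free coordinate can be occupied only once. The sketch below is not the paper's constrained-optimization search; it is a plain exhaustive enumeration, written only to show what a consistent assignment looks like. The function names `site_combos` and `assign_wyckoff` are hypothetical, and the site table is an illustrative subset of the Wyckoff positions of space group 225 (Fm-3m).

```python
from itertools import product

# Illustrative subset of Wyckoff positions for space group 225 (Fm-3m).
# letter -> (multiplicity, has_free_coordinate)
WYCKOFF_225 = {
    "a": (4, False),
    "b": (4, False),
    "c": (8, False),
    "d": (24, False),
    "e": (24, True),
    "f": (32, True),
}


def site_combos(target, sites, start=0):
    """Yield tuples of Wyckoff letters whose multiplicities sum to `target`."""
    if target == 0:
        yield ()
        return
    letters = list(sites)
    for i in range(start, len(letters)):
        letter = letters[i]
        mult, free = sites[letter]
        if mult > target:
            continue
        # A site with a free coordinate may be reused by the same element;
        # a fixed site may appear at most once.
        next_start = i if free else i + 1
        for rest in site_combos(target - mult, sites, next_start):
            yield (letter,) + rest


def assign_wyckoff(cell_counts, sites):
    """Return all element -> Wyckoff-letter assignments matching the cell counts.

    `cell_counts` maps each element to its number of atoms in the unit cell.
    """
    elements = list(cell_counts)
    per_element = [list(site_combos(cell_counts[el], sites)) for el in elements]
    solutions = []
    for choice in product(*per_element):
        # Fixed sites cannot be shared between elements.
        fixed = [l for combo in choice for l in combo if not sites[l][1]]
        if len(fixed) == len(set(fixed)):
            solutions.append(dict(zip(elements, choice)))
    return solutions


if __name__ == "__main__":
    # Rock-salt NaCl: 4 Na and 4 Cl atoms in the conventional cell.
    for solution in assign_wyckoff({"Na": 4, "Cl": 4}, WYCKOFF_225):
        print(solution)
```

For the NaCl cell, only assignments that place Na and Cl on different multiplicity-4 sites survive the check. Exhaustive enumeration grows quickly with cell size and the number of sites, which is presumably why a more efficient search is needed in practice.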