SciCore-Mol is a modular framework that improves LLMs' ability to reason about molecules by integrating three modules: topology-aware perception, latent diffusion-based molecular generation, and reaction-aware reasoning. The modules are coupled to the LLM through learned representation interfaces, which allow richer information exchange than text-only methods. In experiments across chemical tasks, the 8B-parameter open-source system performs strongly in molecular understanding, generation, reaction prediction, and general chemistry knowledge, rivaling proprietary large models.
An open-source 8B model can now rival proprietary LLMs on a range of complex chemistry tasks by plugging in modules that give it explicit molecular "cognition".
Large Language Models (LLMs) are central to the one-for-all intelligent paradigm, but they face a fundamental challenge when dealing with heterogeneous scientific data such as molecules: the inherent gap between discrete linguistic symbols and topological molecular data or continuous reaction data leads to significant information loss and semantic noise in text-based reasoning. We propose SciCore-Mol, a modular framework that bridges this gap through three deeply integrated, pluggable cognitive modules: a topology-aware perception module, a latent diffusion-based molecular generation module, and a reaction-aware reasoning module. Each module is coupled to the LLM backbone through learned representation interfaces, enabling richer information exchange than is possible with text-only tool feedback. Our experiments on diverse chemical tasks demonstrate that SciCore-Mol achieves strong overall performance across molecular understanding, generation, reaction prediction, and general chemistry knowledge: the 8B-parameter open-source system is competitive with, and in several dimensions surpasses, proprietary large models. This work provides a systematic blueprint for equipping LLMs with scientific expertise through decoupled, pluggable, and flexibly orchestrated modules, with direct implications for drug design, chemical synthesis, and broader scientific discovery.
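As a rough illustration of what "pluggable modules coupled through learned representation interfaces" could look like, the following Python sketch registers a module on a backbone and projects its features into the backbone's embedding space as a soft token, instead of returning text. It is not the SciCore-Mol implementation. All names (`LinearInterface`, `PluggableModule`, `ModularBackbone`, `toy_topology_features`) are hypothetical, the encoder is a character-count placeholder rather than real topology-aware perception, and the projection is randomly initialized where a real system would train it.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List

Vector = List[float]


@dataclass
class LinearInterface:
    """Hypothetical interface: a linear map from module features to the LLM embedding space."""
    weight: List[Vector]  # shape: (llm_dim, module_dim)

    @classmethod
    def random_init(cls, module_dim: int, llm_dim: int, seed: int = 0) -> "LinearInterface":
        # Random weights stand in for parameters that would be learned in practice.
        rng = random.Random(seed)
        return cls([[rng.gauss(0.0, 0.02) for _ in range(module_dim)]
                    for _ in range(llm_dim)])

    def project(self, features: Vector) -> Vector:
        # Plain matrix-vector product: one output value per LLM embedding dimension.
        return [sum(w * x for w, x in zip(row, features)) for row in self.weight]


@dataclass
class PluggableModule:
    """A module is an encoder plus the interface that couples it to the backbone."""
    name: str
    encode: Callable[[str], Vector]
    interface: LinearInterface


class ModularBackbone:
    """Stand-in for the LLM backbone; it only tracks which modules are plugged in."""

    def __init__(self, llm_dim: int) -> None:
        self.llm_dim = llm_dim
        self.modules: Dict[str, PluggableModule] = {}

    def plug(self, module: PluggableModule) -> None:
        self.modules[module.name] = module

    def unplug(self, name: str) -> None:
        self.modules.pop(name, None)

    def soft_tokens(self, smiles: str, use: List[str]) -> List[Vector]:
        # Each requested, currently plugged module contributes one embedding-space vector.
        return [
            self.modules[n].interface.project(self.modules[n].encode(smiles))
            for n in use
            if n in self.modules
        ]


def toy_topology_features(smiles: str) -> Vector:
    # Placeholder features: counts of 'C', 'N', 'O' characters and ring-closure digits.
    return [
        float(smiles.count("C")),
        float(smiles.count("N")),
        float(smiles.count("O")),
        float(sum(ch.isdigit() for ch in smiles)),
    ]


backbone = ModularBackbone(llm_dim=8)
backbone.plug(PluggableModule(
    name="perception",
    encode=toy_topology_features,
    interface=LinearInterface.random_init(module_dim=4, llm_dim=8),
))
tokens = backbone.soft_tokens("CCO", use=["perception"])
```

Here `tokens` holds one 8-dimensional vector for ethanol (`CCO`). In this sketch, the contrast with text-only tool feedback is that the module's output reaches the backbone as a continuous vector rather than as a string, and calling `unplug("perception")` removes the module without touching the backbone.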