This paper investigates the development of domain-specific reasoning in small (7B) language models by fine-tuning them on Quantum Field Theory (QFT) problems. To overcome the scarcity of training data, the authors created a data generation pipeline for synthetic problems and adapted human-authored problems. Through supervised fine-tuning and reinforcement learning, they demonstrate significant performance gains in QFT reasoning and analyze the evolution of reasoning errors.
Small language models can achieve strong performance in specialized scientific domains like quantum field theory with targeted fine-tuning and synthetic data generation.
Despite the growing application of Large Language Models (LLMs) to theoretical physics, there is little academic exploration into how domain-specific physics reasoning ability develops while training these models. To investigate this, we perform the first academic fine-tuning study of small (7B-parameter) reasoning models dedicated specifically to theoretical physics. Because the open-source, verifiable training data required to train such capabilities is scarce, we developed a robust data generation pipeline that can both create synthetic problems and make existing human-authored problems suitable for model training. Selecting Quantum Field Theory (QFT) as our primary domain, we generated over 2,500 synthetic problems alongside a curated collection of human-adapted problems sourced from arXiv and standard pedagogical resources. We conduct both Reinforcement Learning (RL) and Supervised Fine-Tuning (SFT) experiments, benchmarking performance gains as well as generalization to other physics domains. We perform an extensive analysis of model chains-of-thought before and after fine-tuning to understand how reasoning errors evolve during RL and SFT. Finally, we publicly release our data pipeline, verifiable QFT training data, and $\sim$200M tokens of QFT reasoning traces.