This paper introduces Pretrained model Guided Genetic Programming (PGGP), a novel symbolic regression approach that leverages a pretrained Transformer model to guide the genetic programming (GP) search process. PGGP uses the Transformer's output to inform the initialization and mutation operators within the GP framework, aiming to improve search efficiency and solution quality. Experimental results demonstrate that PGGP achieves higher accuracy and generates simpler, more interpretable solutions than existing symbolic regression methods.
By seeding genetic programs with Transformer-generated equations, PGGP leapfrogs traditional symbolic regression, finding more accurate and interpretable solutions.
Symbolic Regression (SR) is a powerful technique for uncovering hidden mathematical expressions from observed data and has broad applications in scientific discovery and automatic programming. Genetic Programming (GP) has traditionally been the dominant technique for solving SR, benefiting from a robust global search capability that enables the discovery of solutions with high fitting accuracy. However, GP suffers from low search efficiency and may not fully exploit accumulated knowledge to accelerate convergence. Conversely, deep learning-based methods, particularly those employing Transformer backbones, are trained offline on large-scale datasets. These methods exhibit strong generalization to unseen tasks without additional training, but the lack of task-specific refinement mechanisms renders them inferior to GP methods in accuracy. This study aims to combine the problem-specific search capabilities of GP with the generalization strengths of pretrained Transformer models. Specifically, we propose a pretrained model guided GP (PGGP) method, a GP-based approach that incorporates a pretrained Transformer model to enhance SR problem-solving. New initialization and mutation operators are proposed based on the well-structured equations obtained from the pretrained model. Extensive experiments show that our method not only surpasses comparative methods in accuracy but also reduces the complexity of the generated solutions, potentially enhancing interpretability.
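The guided initialization and mutation described above can be illustrated with a minimal sketch. This is not the paper's implementation: the pretrained Transformer is stubbed as a black-box `pretrained_predict` function returning a fixed expression tree, and the operator details (subtree grafting from the predicted equation, population seeding) are plausible assumptions about how such guidance could work.

```python
import random

# Expressions are nested tuples: ('add', left, right), with variable names
# ('x0', 'x1', ...) or constants as leaves.

def pretrained_predict(X, y):
    # Stand-in for the pretrained Transformer's equation prediction
    # (hypothetical; the paper's actual model and output format differ).
    return ('add', ('mul', 'x0', 'x0'), 'x1')

def count_nodes(expr):
    """Number of nodes in an expression tree."""
    if not isinstance(expr, tuple):
        return 1
    return 1 + sum(count_nodes(c) for c in expr[1:])

def subtrees(expr):
    """Yield every subtree of an expression (preorder)."""
    yield expr
    if isinstance(expr, tuple):
        for child in expr[1:]:
            yield from subtrees(child)

def replace_nth(expr, n, donor):
    """Replace the n-th node (preorder index) with donor.
    Returns (new_expr, nodes_visited_in_original)."""
    if n == 0:
        return donor, count_nodes(expr)
    if not isinstance(expr, tuple):
        return expr, 1
    seen, children = 1, []
    for c in expr[1:]:
        new_c, used = replace_nth(c, n - seen, donor)
        children.append(new_c)
        seen += used
    return (expr[0],) + tuple(children), seen

def random_leaf(n_vars, rng):
    return rng.choice([f'x{i}' for i in range(n_vars)]
                      + [round(rng.uniform(-1, 1), 2)])

def guided_mutate(expr, seed_expr, n_vars, rng, p_seed=0.5):
    # Graft either a subtree of the pretrained equation (prob. p_seed)
    # or a fresh random leaf at a random mutation point.
    donor = (rng.choice(list(subtrees(seed_expr))) if rng.random() < p_seed
             else random_leaf(n_vars, rng))
    n = rng.randrange(count_nodes(expr))
    return replace_nth(expr, n, donor)[0]

def guided_init(seed_expr, pop_size, n_vars, rng):
    # Seed the population with the pretrained prediction plus mutated variants,
    # rather than fully random trees as in standard GP initialization.
    pop = [seed_expr]
    while len(pop) < pop_size:
        pop.append(guided_mutate(seed_expr, seed_expr, n_vars, rng))
    return pop

rng = random.Random(0)
seed = pretrained_predict(None, None)
population = guided_init(seed, pop_size=8, n_vars=2, rng=rng)
```

The key design point is that both operators are biased toward the Transformer's prediction: initialization starts from it, and mutation can reuse its subtrees, so the GP search refines a well-structured equation instead of exploring from scratch.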