This paper explores using LLMs for neural architecture search by placing a code-oriented LLM in a closed-loop synthesis framework with iterative fine-tuning driven by performance feedback and novelty filtering. The LLM synthesizes PyTorch convolutional networks, which are validated, scored on single-epoch accuracy as a low-fidelity performance proxy, and filtered for structural redundancy with a MinHash-Jaccard similarity criterion. Results show that the LLM internalizes architectural priors, improving both the valid generation rate and accuracy, and synthesizes novel, high-performing architectures absent from its original training data.
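As a concrete illustration of the redundancy filter mentioned above, here is a minimal sketch of MinHash-Jaccard novelty checking over generated network code. The shingle size, number of hash permutations, and similarity threshold are illustrative assumptions, not values reported by the paper.

```python
import hashlib
from typing import List, Set


def shingles(code: str, k: int = 5) -> Set[str]:
    """Split generated PyTorch source into overlapping k-token shingles."""
    tokens = code.split()
    return {" ".join(tokens[i:i + k]) for i in range(max(len(tokens) - k + 1, 1))}


def minhash_signature(items: Set[str], num_perm: int = 128) -> List[int]:
    """One minimum hash value per seeded hash function summarizes the set."""
    return [
        min(
            int.from_bytes(
                hashlib.blake2b(f"{seed}:{s}".encode(), digest_size=8).digest(), "big"
            )
            for s in items
        )
        for seed in range(num_perm)
    ]


def estimated_jaccard(sig_a: List[int], sig_b: List[int]) -> float:
    """The fraction of matching minima estimates the Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def is_novel(candidate: str, archive_sigs: List[List[int]], threshold: float = 0.8) -> bool:
    """Accept a candidate only if it is dissimilar to every archived architecture."""
    sig = minhash_signature(shingles(candidate))
    return all(estimated_jaccard(sig, other) < threshold for other in archive_sigs)
```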
LLMs can evolve into autonomous neural architecture designers, learning to generate novel and high-performing architectures by internalizing execution feedback, even surpassing their initial training data.
Large language models (LLMs) excel in program synthesis, yet their ability to autonomously navigate neural architecture design (balancing syntactic reliability, performance, and structural novelty) remains underexplored. We address this by placing a code-oriented LLM within a closed-loop synthesis framework and analyzing its evolution over 22 supervised fine-tuning cycles. The model synthesizes PyTorch convolutional networks, which are validated, evaluated via a low-fidelity performance signal (single-epoch accuracy), and filtered with a MinHash-Jaccard criterion to prevent structural redundancy. High-performing, novel architectures are converted into prompt-code pairs for iterative fine-tuning via parameter-efficient LoRA adaptation, initialized from the LEMUR dataset. Across cycles, the LLM internalizes empirical architectural priors and becomes a robust generator: the valid generation rate stabilizes at 50.6% (peaking at 74.5%), mean first-epoch accuracy rises from 28.06% to 50.99%, and the fraction of candidates exceeding 40% accuracy grows from 2.04% to 96.81%. Analyses confirm the model moves beyond replicating existing motifs, synthesizing 455 high-performing architectures absent from the original corpus. By grounding code synthesis in execution feedback, this work provides a scalable blueprint for transforming stochastic generators into autonomous, performance-driven neural designers, establishing that LLMs can internalize empirical, non-textual rewards and transcend their training data.
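To make the fine-tuning step concrete, the following is a minimal sketch of parameter-efficient LoRA adaptation of a code-oriented causal LLM using the Hugging Face transformers and peft libraries. The base checkpoint, adapter rank, and target modules are assumptions for illustration and are not reported in the abstract.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "codellama/CodeLlama-7b-hf"  # assumed code-oriented base model, not the paper's
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

lora_cfg = LoraConfig(
    r=16,                                 # low-rank adapter dimension (assumed)
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections of a Llama-style model
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)   # only the low-rank adapter weights are trainable
model.print_trainable_parameters()
```

Each cycle would then train these adapter weights on prompt-code pairs mined from the previous cycle's validated, novel, high-performing architectures, keeping the base model frozen.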