HypoExplore is an agentic framework that uses LLMs to explore neural architectures for visual recognition through hypothesis-driven scientific inquiry. A dual strategy of exploiting validated principles and resolving uncertain ones guides the evolution of architectures, while a Trajectory Tree records their lineage and a Hypothesis Memory Bank tracks confidence scores. Experiments on CIFAR-10, CIFAR-100, Tiny-ImageNet, and MedMNIST demonstrate the framework's ability to discover high-performing architectures and to transfer learned principles across lineages.
Given a human-specified research direction, LLMs can discover novel neural architectures that reach state-of-the-art performance in a specialized domain (MedMNIST), suggesting a path towards automated scientific discovery.
We introduce HypoExplore, an agentic framework that formulates neural architecture discovery for visual recognition as a hypothesis-driven scientific inquiry. Given a human-specified high-level research direction, HypoExplore ideates, implements, evaluates, and improves neural architectures through evolutionary branching. A large language model creates each new hypothesis by selecting a parent hypothesis to build upon, guided by a dual strategy that balances exploiting validated principles with resolving uncertain ones. The framework maintains a Trajectory Tree that records the lineage of all proposed architectures, and a Hypothesis Memory Bank that actively tracks confidence scores acquired through experimental evidence. After each experiment, multiple feedback agents analyze the results from different perspectives and consolidate their findings into hypothesis confidence updates. We test the framework on discovering lightweight vision architectures on CIFAR-10, where the best architecture reaches 94.11% accuracy after evolving from a root-node baseline at 18.91%, and the approach generalizes to CIFAR-100 and Tiny-ImageNet. We further demonstrate applicability to a specialized domain by conducting independent architecture discovery runs on MedMNIST, which yield state-of-the-art performance. We show that hypothesis confidence scores grow increasingly predictive as evidence accumulates, and that the learned principles transfer across independent evolutionary lineages, suggesting that HypoExplore not only discovers stronger architectures, but can help build a genuine understanding of the design space.
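The abstract does not specify how confidence scores are computed or how a parent hypothesis is chosen, so the following is only a minimal sketch of the general idea, not HypoExplore's implementation. It assumes a running-mean confidence over feedback votes in [0, 1], an "exploit" mode that picks the most confident hypothesis, and a "resolve" mode that picks the hypothesis whose confidence is closest to 0.5. All class, field, and method names (`Hypothesis`, `MemoryBank`, `select_parent`, and so on) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Hypothesis:
    """One entry in a toy memory bank (field names are hypothetical)."""
    text: str
    parent: Optional[int] = None  # index of the parent entry; None for the root
    confidence: float = 0.5       # 0.5 stands for "no evidence either way"
    n_evidence: int = 0           # number of feedback votes seen so far


@dataclass
class MemoryBank:
    entries: List[Hypothesis] = field(default_factory=list)

    def add(self, text: str, parent: Optional[int] = None) -> int:
        """Store a hypothesis and return its index."""
        self.entries.append(Hypothesis(text, parent))
        return len(self.entries) - 1

    def update(self, idx: int, votes: List[float]) -> None:
        """Fold feedback votes (each in [0, 1]) into a running mean.

        Assumed update rule: the 0.5 placeholder is overwritten by the
        first vote, after which confidence is the mean of all votes.
        """
        h = self.entries[idx]
        for v in votes:
            h.n_evidence += 1
            h.confidence += (v - h.confidence) / h.n_evidence

    def lineage(self, idx: Optional[int]) -> List[int]:
        """Walk parent links back to the root, returned root-first."""
        path = []
        while idx is not None:
            path.append(idx)
            idx = self.entries[idx].parent
        return path[::-1]

    def select_parent(self, mode: str) -> int:
        """Pick a parent: 'exploit' = highest confidence,
        'resolve' = confidence closest to 0.5 (most uncertain)."""
        if mode == "exploit":
            score = lambda i: self.entries[i].confidence
        elif mode == "resolve":
            score = lambda i: -abs(self.entries[i].confidence - 0.5)
        else:
            raise ValueError(f"unknown mode: {mode}")
        return max(range(len(self.entries)), key=score)
```

In this sketch, the parent links double as a simple stand-in for the Trajectory Tree: `lineage` recovers the chain of hypotheses that led to a given entry. A real system would presumably alternate or mix the two selection modes and let the LLM make the final choice; here the modes are kept separate to make the exploit/resolve distinction explicit.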