FlowPIE is a framework for scientific idea generation that tightly couples literature exploration with idea generation. It uses a flow-guided Monte Carlo Tree Search (MCTS), inspired by GFlowNets, to expand literature trajectories and build a diverse initial population of ideas, with an LLM-based generative reward model (GRM) guiding the search. It then treats idea generation as a test-time evolutionary process, applying selection, crossover, and mutation with GRM-based fitness scores. The reported result is ideas with higher novelty, feasibility, and diversity than those produced by existing methods.
Forget static retrieval: FlowPIE's flow-guided literature exploration and evolutionary idea generation unlock more novel, feasible, and diverse scientific ideas.
Scientific idea generation (SIG) is critical to AI-driven autonomous research, yet existing approaches are often constrained by a static retrieval-then-generation paradigm, leading to homogeneous and insufficiently divergent ideas. In this work, we propose FlowPIE, a tightly coupled retrieval-generation framework that treats literature exploration and idea generation as a co-evolving process. FlowPIE expands literature trajectories via a flow-guided Monte Carlo Tree Search (MCTS) inspired by GFlowNets, using the quality of current ideas, as assessed by an LLM-based generative reward model (GRM), as a supervisory signal to guide adaptive retrieval and construct a diverse, high-quality initial population. Based on this population, FlowPIE models idea generation as a test-time idea evolution process, applying selection, crossover, and mutation with the isolation island paradigm and GRM-based fitness computation to incorporate cross-domain knowledge. This mitigates the information cocoons that arise from over-reliance on parametric knowledge and static literature. Extensive evaluations demonstrate that FlowPIE consistently produces ideas with higher novelty, feasibility, and diversity than strong LLM-based and agent-based frameworks, while enabling reward scaling at test time.
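The evolutionary stage described above (selection, crossover, and mutation over isolated islands, scored by a fitness function) can be illustrated with a minimal sketch. This is not FlowPIE's implementation: the function name `evolve_islands`, the representation of an idea as a list of concept tokens, the midpoint crossover, the ring migration scheme, and every parameter are assumptions made for illustration, and the `fitness` callable is only a stand-in for a GRM score.

```python
import random
from typing import Callable, List, Sequence

Idea = List[str]  # hypothetical representation: an idea as a list of concept tokens


def evolve_islands(
    islands: Sequence[Sequence[Idea]],
    fitness: Callable[[Idea], float],  # stand-in for a GRM-based score
    generations: int = 5,
    migrate_every: int = 2,
    mutation_pool: Sequence[str] = (),
    mutation_rate: float = 0.3,
    seed: int = 0,
) -> List[List[Idea]]:
    """Toy island-model evolution; each island needs at least two ideas."""
    rng = random.Random(seed)
    pops = [list(pop) for pop in islands]  # copy so the caller's lists are untouched

    for gen in range(1, generations + 1):
        for i, pop in enumerate(pops):
            size = len(pop)
            # Selection: keep the fitter half (at least two parents).
            ranked = sorted(pop, key=fitness, reverse=True)
            parents = ranked[: max(2, size // 2)]

            children: List[Idea] = []
            while len(parents) + len(children) < size:
                a, b = rng.sample(parents, 2)
                # Crossover: first half of one parent, second half of the other.
                child = a[: len(a) // 2] + b[len(b) // 2 :]
                # Mutation: occasionally swap in a token from an outside pool,
                # loosely mimicking the injection of cross-domain knowledge.
                if child and mutation_pool and rng.random() < mutation_rate:
                    child[rng.randrange(len(child))] = rng.choice(list(mutation_pool))
                children.append(child)
            pops[i] = parents + children

        # Islands evolve in isolation; every few generations the best idea of
        # each island replaces the weakest idea of the next island in a ring.
        if gen % migrate_every == 0:
            bests = [max(pop, key=fitness) for pop in pops]
            for i, pop in enumerate(pops):
                worst = min(range(len(pop)), key=lambda j: fitness(pop[j]))
                pop[worst] = list(bests[i - 1])

    return pops
```

Because the fitter half of each island always survives, the best score on an island never decreases across generations in this sketch. In a real system the fitness call would be an expensive LLM evaluation, so caching scores per idea would likely matter more than it does here.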