The paper introduces HuggingR$^4$, a novel framework for selecting optimal AI models from large repositories like Hugging Face by framing model selection as an iterative reasoning process. HuggingR$^4$ integrates Reasoning, Retrieval, Refinement, and Reflection to decompose user intent, retrieve candidates, refine selections, and validate results. Experiments on a new benchmark of 14,399 user requests demonstrate that HuggingR$^4$ significantly outperforms existing methods in workability and reasonability while reducing token consumption.
LLM agents can now navigate Hugging Face's vast model zoo with 6.9x lower token consumption and 33% better reasoning, thanks to a new iterative selection framework.
Building effective LLM agents increasingly requires selecting appropriate AI models as tools from large open repositories (e.g., Hugging Face with >2M models) based on natural language requests. Unlike invoking a fixed set of API tools, repository-scale model selection must handle massive, evolving candidates with incomplete metadata. Existing approaches incorporate full model descriptions into prompts, resulting in prompt bloat, excessive token costs, and limited scalability. To address these issues, we propose HuggingR$^4$, the first framework to recast model selection as an iterative reasoning process rather than one-shot retrieval. By synergistically integrating Reasoning, Retrieval, Refinement, and Reflection, HuggingR$^4$ progressively decomposes user intent, retrieves candidates through multi-round deliberation, refines selections via fine-grained analysis, and validates results through reflection. To facilitate rigorous evaluation, we introduce a large-scale benchmark comprising 14,399 diverse user requests across 37 task categories. Experiments demonstrate that HuggingR$^4$ achieves 92.03% workability and 82.46% reasonability, outperforming current state-of-the-art baselines by 26.51% and 33.25%, respectively, while reducing token consumption by $6.9 \times$.