Mango is a multi-agent web navigation method that addresses the inefficiency of single-agent explorers that always start from the root URL: it uses the website's structure to choose starting URLs dynamically. It formulates URL selection as a multi-armed bandit problem, uses Thompson Sampling to allocate the navigation budget across candidate URLs, and adds an episodic memory that stores navigation history so the agent can learn from past attempts. On WebVoyager, Mango reaches a 63.6% success rate with GPT-5-mini, and on WebWalkerQA it reaches 52.5%, outperforming the best baseline in both cases. The authors also report that the method generalizes across open-source and closed-source LLM backbones.
Multi-agent exploration with dynamic starting points can more than double the success rate of web navigation tasks compared to single-agent methods that start from the root URL.
Existing web agents typically initiate exploration from the root URL, which is inefficient for complex websites with deep hierarchical structures. Without a global view of the website's structure, agents frequently fall into navigation traps, explore irrelevant branches, or fail to reach target information within a limited budget. We propose Mango, a multi-agent web navigation method that leverages the website structure to dynamically determine optimal starting points. We formulate URL selection as a multi-armed bandit problem and employ Thompson Sampling to adaptively allocate the navigation budget across candidate URLs. Furthermore, we introduce an episodic memory component to store navigation history, enabling the agent to learn from previous attempts. Experiments on WebVoyager demonstrate that Mango achieves a success rate of 63.6% when using GPT-5-mini, outperforming the best baseline by 7.3%. Furthermore, on WebWalkerQA, Mango attains a 52.5% success rate, surpassing the best baseline by 26.8%. We also demonstrate the generalizability of Mango using both open-source and closed-source models as backbones. Our data and code are open-source and available at https://github.com/VichyTong/Mango.
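The bandit formulation described in the abstract can be illustrated with a minimal Beta-Bernoulli Thompson Sampling sketch. This is not the authors' implementation: the class and function names (`UrlArm`, `ThompsonUrlSelector`, `allocate_budget`), the uniform Beta(1, 1) prior, the binary success/failure reward, and the flat list used as a stand-in for episodic memory are all assumptions made for illustration. The `navigate` callback is a hypothetical placeholder for an agent that explores from a given starting URL and reports whether it found the target.

```python
import random
from dataclasses import dataclass, field


@dataclass
class UrlArm:
    """One candidate starting URL with a Beta posterior over its success chance."""
    url: str
    successes: int = 0
    failures: int = 0

    def sample(self, rng: random.Random) -> float:
        # Beta(1 + successes, 1 + failures): a uniform prior before any attempt
        # (the prior is an assumption of this sketch).
        return rng.betavariate(1 + self.successes, 1 + self.failures)


@dataclass
class ThompsonUrlSelector:
    arms: list
    rng: random.Random = field(default_factory=random.Random)
    # Simplified stand-in for episodic memory: a log of (url, succeeded) pairs.
    memory: list = field(default_factory=list)

    def select(self) -> UrlArm:
        # Draw one sample per arm and start from the URL with the highest draw.
        return max(self.arms, key=lambda arm: arm.sample(self.rng))

    def update(self, arm: UrlArm, succeeded: bool) -> None:
        # Update the chosen arm's posterior and record the attempt.
        if succeeded:
            arm.successes += 1
        else:
            arm.failures += 1
        self.memory.append((arm.url, succeeded))


def allocate_budget(urls, navigate, budget, seed=0):
    """Spend `budget` navigation attempts across `urls` via Thompson Sampling."""
    selector = ThompsonUrlSelector([UrlArm(u) for u in urls], random.Random(seed))
    for _ in range(budget):
        arm = selector.select()
        selector.update(arm, navigate(arm.url))
    return selector


if __name__ == "__main__":
    # Hypothetical URLs and a simulated navigator that only succeeds from /docs.
    candidates = ["https://example.com/", "https://example.com/docs"]
    result = allocate_budget(candidates, lambda u: u.endswith("/docs"), budget=30)
    for arm in result.arms:
        print(arm.url, arm.successes, arm.failures)
```

In this sketch, starting URLs that have led to successful attempts are sampled more often over time, while URLs with few attempts still get occasional draws because their posteriors remain wide. A fuller system would presumably use richer feedback than a binary outcome and let the stored history inform later attempts, as the abstract describes.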