This paper introduces WAC, a web agent architecture that enhances action selection and execution through model collaboration, consequence simulation, and feedback-driven action refinement. WAC employs a multi-agent system where an action model consults a world model for strategic guidance, improving action proposal based on simulated environmental state transitions. A two-stage deduction chain, involving a world model simulating action outcomes and a judge model providing corrective feedback, enables risk-aware task execution, leading to performance gains on VisualWebArena and Online-Mind2Web.
Web agents can become significantly more reliable by consulting a world model to simulate the consequences of their actions *before* committing to them.
Web agents based on large language models have demonstrated promising capability in automating web tasks. However, current web agents struggle to reason out sensible actions because they are limited in predicting environment changes. They may also lack comprehensive awareness of execution risks, prematurely performing risky actions that cause losses and lead to task failure. To address these challenges, we propose WAC, a web agent that integrates model collaboration, consequence simulation, and feedback-driven action refinement. To overcome the cognitive isolation of individual models, we introduce a multi-agent collaboration process in which an action model consults a world model, acting as a web-environment expert, for strategic guidance; the action model then grounds these suggestions into executable actions, leveraging prior knowledge of environmental state transition dynamics to improve its candidate action proposals. To achieve risk-aware, resilient task execution, we introduce a two-stage deduction chain: a world model, specialized in environmental state transitions, simulates action outcomes, and a judge model then scrutinizes these simulated outcomes and triggers corrective feedback on the action when necessary. Experiments show that WAC achieves absolute gains of 1.8% on VisualWebArena and 1.3% on Online-Mind2Web.
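
The control flow described in the abstract can be sketched as a small loop. This is a minimal illustration, not the authors' implementation: the function `select_action`, the `Verdict` type, the four callables, the `max_refinements` budget, and the choice to return `None` when no candidate passes are all assumptions made for this example. In a real agent each callable would wrap a model call; here they are toy stubs so the sketch runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Verdict:
    """Hypothetical judge output: accept the action, or explain why not."""
    ok: bool
    feedback: str = ""


def select_action(
    task: str,
    observation: str,
    advise: Callable[[str, str], str],             # world model as advisor
    propose: Callable[[str, str, str, str], str],  # action model
    simulate: Callable[[str, str], str],           # world model as simulator
    judge: Callable[[str, str, str], Verdict],     # judge model
    max_refinements: int = 2,                      # assumed budget, not from the paper
) -> Tuple[Optional[str], int]:
    """Return (accepted action or None, number of rejected candidates)."""
    # Model collaboration: the action model asks the world model for guidance.
    guidance = advise(task, observation)
    feedback = ""
    for attempt in range(max_refinements + 1):
        # The action model grounds guidance (and any feedback) into an action.
        action = propose(task, observation, guidance, feedback)
        # Stage 1 of the deduction chain: simulate the outcome before acting.
        predicted = simulate(observation, action)
        # Stage 2: the judge scrutinizes the simulated outcome.
        verdict = judge(task, action, predicted)
        if verdict.ok:
            return action, attempt
        # Feedback-driven refinement: retry with the judge's critique.
        feedback = verdict.feedback
    # Assumption: if nothing passes, execute nothing and let the caller decide.
    return None, max_refinements + 1


# Toy stubs standing in for the three models.
def toy_advise(task: str, observation: str) -> str:
    return "Review the cart before paying."


def toy_propose(task: str, observation: str, guidance: str, feedback: str) -> str:
    return "click('Review cart')" if feedback else "click('Place order')"


def toy_simulate(observation: str, action: str) -> str:
    return "order submitted" if "Place order" in action else "cart page shown"


def toy_judge(task: str, action: str, predicted: str) -> Verdict:
    if "submitted" in predicted:
        return Verdict(False, "Irreversible purchase; verify the cart first.")
    return Verdict(True)


action, refinements = select_action(
    "Buy the cheapest mug", "product page",
    toy_advise, toy_propose, toy_simulate, toy_judge,
)
print(action, refinements)
```

In this toy run the first candidate is rejected because its simulated outcome is an irreversible purchase, and the refined candidate is accepted. The risky action is never executed against the real page, which is the point of simulating before committing.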