Mar 11, 2026 · arXiv:2603.10505

Safe and Scalable Web Agent Learning via Recreated Websites

Hyungjoo Chae, Jungsoo Park, Alan Ritter

AI Summary

VeriEnv is a framework that uses language models to clone real-world websites into fully executable, verifiable synthetic environments for training web agents. It exposes controlled internal access through a Python SDK, so agents can self-generate tasks with deterministic, programmatically verifiable rewards. Experiments show that agents trained with VeriEnv generalize to unseen websites, achieve site-specific mastery through self-evolving training, and benefit from scaling the number of training environments.

Key Contribution

VeriEnv trains web-navigating agents in safe, scalable, and verifiable synthetic environments that are automatically cloned from real websites, sidestepping the risks and limitations of real-world interaction.

Abstract

Training autonomous web agents is fundamentally limited by the environments they learn from: real-world websites are unsafe to explore, hard to reset, and rarely provide verifiable feedback. We propose VeriEnv, a framework that treats language models as environment creators, automatically cloning real-world websites into fully executable, verifiable synthetic environments. By exposing controlled internal access via a Python SDK, VeriEnv enables agents to self-generate tasks with deterministic, programmatically verifiable rewards, eliminating reliance on heuristic or LLM-based judges. This design decouples agent learning from unsafe real-world interaction while enabling scalable self-evolution through environment expansion. Through experiments on web agent benchmarks, we show that agents trained with VeriEnv generalize to unseen websites, achieve site-specific mastery through self-evolving training, and benefit from scaling the number of training environments. Code and resources will be released at https://github.com/kyle8581/VeriEnv upon acceptance.
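
To make the idea of a deterministic, programmatically verifiable reward concrete, here is a minimal sketch in Python. It is not the VeriEnv SDK: the `ToyShopEnv` class, the `Task` type, and the `reward` function are hypothetical names invented for illustration, and the "cloned website" is reduced to a small in-memory state. The point is only that when the environment's internal state is accessible, task success can be checked by code rather than by a heuristic or LLM-based judge.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToyShopEnv:
    """Hypothetical stand-in for the backend state of a cloned shopping site."""
    cart: Dict[str, int] = field(default_factory=dict)
    orders: List[Dict[str, int]] = field(default_factory=list)

    def reset(self) -> None:
        # A synthetic environment can be reset cheaply and safely.
        self.cart.clear()
        self.orders.clear()

    # Actions an agent would normally trigger through the web UI.
    def add_to_cart(self, item: str, qty: int = 1) -> None:
        self.cart[item] = self.cart.get(item, 0) + qty

    def checkout(self) -> None:
        if self.cart:
            self.orders.append(dict(self.cart))
            self.cart.clear()


@dataclass
class Task:
    """A task pairs a natural-language instruction with a state check."""
    instruction: str
    verify: Callable[[ToyShopEnv], bool]


def reward(env: ToyShopEnv, task: Task) -> float:
    # Deterministic reward: 1.0 if the verifier passes on the final state, else 0.0.
    return 1.0 if task.verify(env) else 0.0


# A task whose success condition is checked against internal state (assumed example).
task = Task(
    instruction="Order two units of 'usb-cable'.",
    verify=lambda env: any(order.get("usb-cable") == 2 for order in env.orders),
)

env = ToyShopEnv()
env.reset()
reward_before = reward(env, task)  # nothing has been ordered yet

env.add_to_cart("usb-cable", 2)
env.checkout()
reward_after = reward(env, task)   # the order now exists in the state

env.reset()
reward_after_reset = reward(env, task)  # resetting clears the order again
```

In this sketch the verifier inspects only the final state, so the same trajectory always yields the same reward, and the reset makes repeated exploration harmless. How VeriEnv actually generates tasks and verifiers is described in the paper, not here.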

Data Curation & Synthetic Data · Tool Use & Agents

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
