The paper introduces Policy-Guided Hybrid Simulation (PGHS), a dual-process framework for simulating group-level user behavior in merchant business scenarios. PGHS addresses information incompleteness and mechanism duality by combining an LLM-based reasoning branch, anchored by decision policies mined from behavioral trajectories, with an ML-based fitting branch. Experiments on Meituan demonstrate that PGHS reduces group simulation error by over 40% compared to state-of-the-art reasoning-based and fitting-based baselines.
Combining an LLM-based reasoning branch with an ML-based fitting branch, both aligned through decision policies mined from behavioral trajectories, simulates group-level user behavior more accurately than either approach alone.
Simulating group-level user behavior enables scalable counterfactual evaluation of merchant strategies without costly online experiments. However, building a trustworthy simulator faces two structural challenges. First, information incompleteness causes reasoning-based simulators to over-rationalize when unobserved factors such as offline context and implicit habits are missing. Second, mechanism duality requires capturing both interpretable preferences and implicit statistical regularities, which no single paradigm achieves alone. We propose Policy-Guided Hybrid Simulation (PGHS), a dual-process framework that mines transferable decision policies from behavioral trajectories and uses them as a shared alignment layer. This layer anchors an LLM-based reasoning branch that prevents over-rationalization and an ML-based fitting branch that absorbs implicit regularities. Group-level predictions from both branches are fused for complementary correction. We deploy PGHS on Meituan with 101 merchants and over 26,000 trajectories. PGHS achieves a group simulation error of 8.80%, improving over the best reasoning-based and fitting-based baselines by 45.8% and 40.9% respectively.
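The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the two branches are fused or how group simulation error is computed, so everything here is an assumption made for illustration: a fixed convex weight (`reasoning_weight`), mean absolute error as the error metric, and the function names `fuse_group_predictions` and `mean_absolute_error`. The numbers are toy values, not results from the paper.

```python
from typing import List, Sequence


def fuse_group_predictions(
    reasoning: Sequence[float],
    fitting: Sequence[float],
    reasoning_weight: float = 0.5,
) -> List[float]:
    """Blend per-group predictions from the two branches.

    Assumption: a fixed convex combination. The paper may use a
    different (for example, learned or group-specific) fusion rule.
    """
    if len(reasoning) != len(fitting):
        raise ValueError("both branches must predict the same groups")
    if not 0.0 <= reasoning_weight <= 1.0:
        raise ValueError("reasoning_weight must lie in [0, 1]")
    return [
        reasoning_weight * r + (1.0 - reasoning_weight) * f
        for r, f in zip(reasoning, fitting)
    ]


def mean_absolute_error(
    predicted: Sequence[float], observed: Sequence[float]
) -> float:
    """Average absolute gap between predicted and observed group rates.

    Assumption: used here as a stand-in for the paper's error metric.
    """
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


if __name__ == "__main__":
    # Toy group-level rates (hypothetical): the reasoning branch
    # overshoots and the fitting branch undershoots the observed values.
    reasoning_branch = [0.30, 0.50]
    fitting_branch = [0.20, 0.30]
    observed = [0.25, 0.40]

    fused = fuse_group_predictions(reasoning_branch, fitting_branch)
    print("reasoning error:", mean_absolute_error(reasoning_branch, observed))
    print("fitting error:  ", mean_absolute_error(fitting_branch, observed))
    print("fused error:    ", mean_absolute_error(fused, observed))
```

In this toy case the two branches err in opposite directions, so averaging them cancels most of the error. That is one simple way to read "complementary correction"; when both branches err in the same direction, a fixed average cannot do better than the stronger branch.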