Mar 12, 2026 | arXiv:2603.12152

LifeSim: Long-Horizon User Life Simulator for Personalized Assistant Evaluation

AI Summary

LifeSim is introduced as a user simulator that models user cognition using the Belief-Desire-Intention (BDI) framework within physical environments to generate coherent life trajectories and intention-driven interactive behaviors. Based on LifeSim, the authors create LifeSim-Eval, a benchmark comprising 8 life domains and 1,200 scenarios, designed for multi-turn interactive assessment of personalized assistants. Experiments using LifeSim-Eval reveal that current LLMs struggle with implicit intention understanding and long-term user preference modeling in both single-scenario and long-horizon settings.

Key Contribution

Current LLMs fall short in understanding implicit intentions and modeling long-term user preferences, as revealed by a new benchmark, LifeSim-Eval, designed to simulate real-world user-assistant interactions.

Abstract

The rapid advancement of large language models (LLMs) has accelerated progress toward universal AI assistants. However, existing benchmarks for personalized assistants remain misaligned with real-world user-assistant interactions, failing to capture the complexity of external contexts and users' cognitive states. To bridge this gap, we propose LifeSim, a user simulator that models user cognition through the Belief-Desire-Intention (BDI) model within physical environments to generate coherent life trajectories, and simulates intention-driven user interactive behaviors. Based on LifeSim, we introduce LifeSim-Eval, a comprehensive benchmark for multi-scenario, long-horizon personalized assistance. LifeSim-Eval covers 8 life domains and 1,200 diverse scenarios, and adopts a multi-turn interactive method to assess models' abilities to complete explicit and implicit intentions, recover user profiles, and produce high-quality responses. Under both single-scenario and long-horizon settings, our experiments reveal that current LLMs face significant limitations in handling implicit intentions and in modeling long-term user preferences.
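
For readers unfamiliar with the Belief-Desire-Intention model mentioned above, the following is a minimal, generic sketch of one BDI deliberation step. It is not LifeSim's implementation: the class name `SimulatedUser`, the `blocked:` belief-key convention, and the numeric priorities are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SimulatedUser:
    """Toy BDI agent. Every field and rule here is an illustrative assumption."""
    beliefs: dict = field(default_factory=dict)   # what the user holds to be true
    desires: list = field(default_factory=list)   # (goal, priority) pairs
    intention: Optional[str] = None               # the goal currently committed to

    def perceive(self, event: dict) -> None:
        # Belief revision: merge new observations from the environment.
        self.beliefs.update(event)

    def deliberate(self) -> None:
        # Drop desires that current beliefs mark as blocked (hypothetical convention),
        # then commit to the highest-priority remaining desire, if any.
        feasible = [(goal, prio) for goal, prio in self.desires
                    if not self.beliefs.get(f"blocked:{goal}", False)]
        self.intention = max(feasible, key=lambda gp: gp[1])[0] if feasible else None


# Usage: rain blocks the preferred goal, so the agent commits to the next one.
user = SimulatedUser(desires=[("eat dinner out", 2), ("go for a run", 3)])
user.perceive({"weather": "rain", "blocked:go for a run": True})
user.deliberate()
print(user.intention)
```

In this toy version the intention is never stated to an assistant directly; a simulator in the spirit of the abstract would generate utterances from such a hidden intention, which is what makes the intention "implicit" from the assistant's point of view.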

Eval Frameworks & Benchmarks | Tool Use & Agents | World Models & Planning

Citation Metrics

Citations: 0

Influential citations: 0

References: 42

Year: 2026

Venue: N/A
