Hanyang | Apr 20, 2026 | arXiv:2604.17886

Latent Preference Modeling for Cross-Session Personalized Tool Calling

AI Summary

The paper introduces MPT, a benchmark for evaluating personalized tool calling in LLM agents across multiple sessions, covering preference recall, induction, and transfer. To handle under-specified user requests, it proposes PRefine, a memory-augmented method that iteratively generates, verifies, and refines hypotheses about user preferences from dialogue history. PRefine improves tool-calling accuracy while using only 1.24% of the tokens required by full-history prompting, which suggests that effective personalization depends on capturing the reasoning behind user preferences.

Key Contribution

Instead of full-history prompting, this work explicitly models and refines latent user preferences, improving tool-calling accuracy while using only 1.24% of the tokens (a reduction of about 98.8%).

Abstract

Users often omit essential details in their requests to LLM-based agents, resulting in under-specified inputs for tool use. This poses a fundamental challenge for tool-augmented agents, as API execution typically requires complete arguments, highlighting the need for personalized tool calling. To study this problem, we introduce MPT, a benchmark comprising 265 multi-session dialogues that cover three challenges: Preference Recall, Preference Induction, and Preference Transfer. We also propose PRefine, a test-time memory-augmented method that represents user preferences as evolving hypotheses. Through a generate-verify-refine loop, it extracts reusable constraints from history and improves tool-calling accuracy while using only 1.24% of the tokens required by full-history prompting. These results indicate that robust personalization in agentic systems depends on memory that captures the reasons behind user choices, not just the choices themselves.
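The generate-verify-refine loop described in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: PRefine is described as a test-time method for LLM agents, whereas the sketch below swaps in simple frequency counting so that it runs without a model. All names (`Hypothesis`, `generate`, `verify`, `refine`, `build_memory`, `fill_arguments`) and the flight-booking data are hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    slot: str            # tool argument the user tends to omit
    value: str           # value we believe the user prefers
    context: tuple = ()  # optional (key, value) condition explaining the choice

    def applies(self, call):
        # An unconditional hypothesis applies everywhere.
        return not self.context or call.get(self.context[0]) == self.context[1]


def generate(history, slot):
    """Propose an unconditional hypothesis: the most frequent past value."""
    values = Counter(call[slot] for call in history if slot in call)
    return Hypothesis(slot, values.most_common(1)[0][0]) if values else None


def verify(hyp, history):
    """Return the past calls that contradict the hypothesis."""
    return [c for c in history
            if hyp.slot in c and hyp.applies(c) and c[hyp.slot] != hyp.value]


def refine(hyp, history, context_keys):
    """Replace a contradicted hypothesis with context-conditioned ones."""
    for key in context_keys:
        groups = {}
        for c in history:
            if hyp.slot in c and key in c:
                groups.setdefault(c[key], set()).add(c[hyp.slot])
        # Accept the key only if each context value maps to one slot value.
        if groups and all(len(v) == 1 for v in groups.values()):
            return [Hypothesis(hyp.slot, next(iter(v)), (key, k))
                    for k, v in groups.items()]
    return [hyp]  # no better explanation found; keep the original


def build_memory(history, slots, context_keys, max_rounds=3):
    """Run generate -> verify -> refine per slot and collect the hypotheses."""
    memory = []
    for slot in slots:
        first = generate(history, slot)
        if first is None:
            continue
        hyps = [first]
        for _ in range(max_rounds):
            if all(not verify(h, history) for h in hyps):
                break
            hyps = [r for h in hyps
                    for r in (refine(h, history, context_keys)
                              if verify(h, history) else [h])]
        memory.extend(hyps)
    return memory


def fill_arguments(request, memory):
    """Complete an under-specified tool call from the stored hypotheses."""
    filled = dict(request)
    for h in memory:
        if h.slot not in filled and h.applies(filled):
            filled[h.slot] = h.value
    return filled


# Hypothetical history of past tool calls across sessions.
history = [
    {"tool": "book_flight", "trip": "business", "seat_class": "business"},
    {"tool": "book_flight", "trip": "leisure", "seat_class": "economy"},
    {"tool": "book_flight", "trip": "business", "seat_class": "business"},
]
memory = build_memory(history, slots=["seat_class"], context_keys=["trip"])
filled = fill_arguments({"tool": "book_flight", "trip": "leisure"}, memory)
print(filled)
```

In this toy run, the first hypothesis ("the user prefers business class") is contradicted by the leisure booking, so it is refined into two conditional hypotheses keyed on trip type. The memory then holds a few compact constraints rather than the full history, which is the intuition behind the token savings the abstract reports.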

Eval Frameworks & Benchmarks | Recommendation & Information Retrieval | RLHF & Preference Learning | Tool Use & Agents

Citation Metrics

Citations: 0

Influential citations: 0

References: 38

Year: 2026

Venue: N/A
