This paper introduces $\Psi$-Bench, a benchmark for evaluating how well large language models (LLMs) can proactively influence users in persuasive dialogues. The authors simulate clients whose personalized profiles are derived from dialogue histories and evaluate 10 frontier LLMs across three real-world persuasion scenarios. The results show that most models generate coherent arguments but still struggle to persuade effectively. Giving models access to client profiles yields an average performance gain of 18.24%, underscoring the potential of persona-sensitive influencing in dialogue systems.
Even state-of-the-art LLMs struggle with persuasion, but giving them access to user profiles improves their performance by over 18% on average.
Personalization is a crucial capability of modern language agents. However, current research primarily positions personalized agents as passive responders to user preferences, limiting their ability to interact with users and provide suggestions or guidance proactively. To systematically evaluate such proactive personalization in realistic interactions, we propose $\Psi$-Bench, a benchmark for assessing LLMs' ability to influence realistic users through conversation. We design three real-world interaction scenarios that involve persuasion in $\Psi$-Bench, and endow simulated clients with personal characteristics through explicit user profiles derived from dialogue histories. We evaluate 10 frontier LLMs on $\Psi$-Bench and find that while most models can produce coherent and reasonable arguments, even state-of-the-art models still leave considerable room for improvement in persuasion. We also find that providing access to client profiles yields an average performance gain of 18.24%, highlighting the importance of user-specific information for effective persuasion. Overall, our work highlights persona-sensitive influencing as a challenging yet practical direction for evaluating and developing more proactive personalized LLM agents. Code is available at: https://github.com/Hanpx20/Psi-Bench.
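As a rough illustration of how an "average performance gain" across models could be computed, the sketch below averages per-model relative improvements between a no-profile and a with-profile condition. This is an assumption about the metric, not the authors' evaluation code: the abstract does not say whether the 18.24% figure is relative or absolute, and the function name, model names, and scores here are hypothetical placeholders, not results from the paper.

```python
from statistics import mean


def average_relative_gain(without_profile, with_profile):
    """Mean per-model relative gain, in percent, from adding client profiles.

    Both arguments map a model name to its persuasion score. Treating the gain
    as relative (rather than absolute) is an assumption of this sketch.
    """
    gains = []
    for model, base in without_profile.items():
        boosted = with_profile[model]
        gains.append(100.0 * (boosted - base) / base)
    return mean(gains)


# Hypothetical placeholder scores, NOT results reported in the paper.
without_profile = {"model_a": 40.0, "model_b": 50.0}
with_profile = {"model_a": 50.0, "model_b": 55.0}

avg_gain = average_relative_gain(without_profile, with_profile)
print(f"Average gain: {avg_gain:.2f}%")
```

With these placeholder scores the per-model gains are 25% and 10%, so the average is 17.5%. Averaging per-model gains weights every model equally; computing the gain of the averaged scores instead would generally give a different number.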