Tsinghua AI; Beijing University of Posts and Telecommunications; China Telecom Corporation Limited; Southwestern University of Finance and Economics; TU Eindhoven

Apr 15, 2026

arXiv:2604.13817

RPS: Information Elicitation with Reinforcement Prompt Selection

Tao Wang, Jingyao Lu, Xibo Wang, Haonan Huang, Su Yao, Zhiqiang Hu, Xingyan Chen, Enmao Diao

AI Summary

This paper introduces Reinforcement Prompt Selection (RPS), a reinforcement learning framework for adaptive information elicitation from users in open-ended dialogues. RPS learns a policy over a pool of prompts to strategically elicit concealed or incompletely expressed information. Experiments on a synthetic task and a new legal case benchmark (IELegal) demonstrate that RPS outperforms static prompt baselines, highlighting the benefits of adaptive prompt selection for information elicitation.

Key Contribution

RL can teach LLMs to be better interviewers, adaptively prompting users to reveal hidden information in dialogue.

Abstract

Large language models (LLMs) have shown remarkable capabilities in dialogue generation and reasoning, yet their effectiveness in eliciting user-known but concealed information in open-ended conversations remains limited. In many interactive AI applications, such as personal assistants, tutoring systems, and legal or clinical support, users often withhold sensitive or uncertain information due to privacy concerns, ambiguity, or social hesitation. This makes it challenging for LLMs to gather complete and contextually relevant inputs. In this work, we define the problem of information elicitation in open-ended dialogue settings and propose Reinforcement Prompt Selection (RPS), a lightweight reinforcement learning framework that formulates prompt selection as a sequential decision-making problem. To analyze this problem in a controlled setting, we design a synthetic experiment, where a reinforcement learning agent outperforms a random query baseline, illustrating the potential of policy-based approaches for adaptive information elicitation. Building on this insight, RPS learns a policy over a pool of prompts to adaptively elicit concealed or incompletely expressed information from users through dialogue. We also introduce IELegal, a new benchmark dataset constructed from real legal case documents, which simulates dialogue-based information elicitation tasks aimed at uncovering case-relevant facts. In this setting, RPS outperforms static prompt baselines, demonstrating the effectiveness of adaptive prompt selection for eliciting critical information in LLM-driven dialogue systems.
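As a rough illustration of the sequential decision-making framing described in the abstract, the toy sketch below trains a tabular Q-learning policy over a small prompt pool against a simulated user and compares it with random prompt selection. It is not the authors' RPS implementation. The environment, the reveal probabilities, the reward (one point per newly revealed fact), and all names such as `REVEAL_PROB`, `step`, and `train_q` are assumptions made for illustration only.

```python
import random

# Hypothetical toy setup (not from the paper): 3 hidden facts and 5 prompts.
# Prompt i < 3 reveals fact i with probability REVEAL_PROB[i];
# prompts 3 and 4 never reveal anything.
N_FACTS = 3
REVEAL_PROB = [0.9, 0.8, 0.7, 0.0, 0.0]
N_PROMPTS = len(REVEAL_PROB)
HORIZON = 4  # dialogue turns per episode


def step(state, action, rng):
    """Simulated user. State is a bitmask of facts revealed so far."""
    already_known = (state >> action) & 1 if action < N_FACTS else 1
    if not already_known and rng.random() < REVEAL_PROB[action]:
        return state | (1 << action), 1.0  # one new fact elicited
    return state, 0.0


def greedy(q, state):
    """Pick the prompt with the highest estimated value in this state."""
    return max(range(N_PROMPTS), key=lambda a: q[state][a])


def train_q(episodes=5000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration.

    Simplification: the turn index is not part of the state, so the
    finite horizon is only approximated through the discount factor.
    """
    rng = random.Random(seed)
    q = [[0.0] * N_PROMPTS for _ in range(1 << N_FACTS)]
    for _ in range(episodes):
        state = 0
        for _ in range(HORIZON):
            if rng.random() < epsilon:
                action = rng.randrange(N_PROMPTS)
            else:
                action = greedy(q, state)
            nxt, reward = step(state, action, rng)
            target = reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q


def evaluate(policy, n=2000, seed=1):
    """Average number of facts elicited per episode under a policy."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        state = 0
        for _ in range(HORIZON):
            state, reward = step(state, policy(state), rng)
            total += reward
    return total / n


if __name__ == "__main__":
    q_table = train_q()
    baseline_rng = random.Random(2)
    print("learned:", evaluate(lambda s: greedy(q_table, s)))
    print("random: ", evaluate(lambda s: baseline_rng.randrange(N_PROMPTS)))
```

Encoding the state as the set of facts already revealed lets the learned policy avoid re-asking prompts that have nothing left to elicit, which is the behavior a random baseline lacks. A real system would replace the simulated user and the hand-set probabilities with dialogue context and an LLM-driven prompt pool.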

Natural Language Processing; RLHF & Preference Learning; Tool Use & Agents

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
