This paper introduces Reinforcement Prompt Selection (RPS), a reinforcement learning framework for adaptive information elicitation from users in open-ended dialogues. RPS learns a policy over a pool of prompts to strategically elicit concealed or incompletely expressed information. Experiments on a synthetic task and a new legal case benchmark (IELegal) demonstrate that RPS outperforms static prompt baselines, highlighting the benefits of adaptive prompt selection for information elicitation.
RL can teach LLMs to be better interviewers, adaptively prompting users to reveal hidden information in dialogue.
Large language models (LLMs) have shown remarkable capabilities in dialogue generation and reasoning, yet their effectiveness in eliciting user-known but concealed information in open-ended conversations remains limited. In many interactive AI applications, such as personal assistants, tutoring systems, and legal or clinical support, users often withhold sensitive or uncertain information due to privacy concerns, ambiguity, or social hesitation. This makes it challenging for LLMs to gather complete and contextually relevant inputs. In this work, we define the problem of information elicitation in open-ended dialogue settings and propose Reinforcement Prompt Selection (RPS), a lightweight reinforcement learning framework that formulates prompt selection as a sequential decision-making problem. To analyze this problem in a controlled setting, we design a synthetic experiment, where a reinforcement learning agent outperforms a random query baseline, illustrating the potential of policy-based approaches for adaptive information elicitation. Building on this insight, RPS learns a policy over a pool of prompts to adaptively elicit concealed or incompletely expressed information from users through dialogue. We also introduce IELegal, a new benchmark dataset constructed from real legal case documents, which simulates dialogue-based information elicitation tasks aimed at uncovering case-relevant facts. In this setting, RPS outperforms static prompt baselines, demonstrating the effectiveness of adaptive prompt selection for eliciting critical information in LLM-driven dialogue systems.
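To make the "prompt selection as a sequential decision-making problem" framing concrete, here is a minimal toy sketch. It is not the authors' RPS implementation, and it does not reproduce their synthetic experiment or IELegal. Everything in it is an assumption made for illustration: the three prompt names, the simulated user (who reveals the next hidden fact only when asked with the one prompt that is effective at the current stage of the dialogue), the turn budget, and the use of tabular Q-learning with epsilon-greedy exploration as the policy learner.

```python
import random

# Hypothetical prompt pool; the source does not list the actual prompts.
PROMPTS = ["open_question", "reassure_privacy", "direct_followup"]
N_FACTS = 3      # hidden facts held by the simulated user (assumption)
MAX_TURNS = 4    # dialogue budget per episode (assumption)
# Toy user model (assumption): with s facts revealed so far, only the
# prompt with index EFFECTIVE[s] elicits the next fact.
EFFECTIVE = [1, 0, 2]


def step(state, action):
    """Simulate one dialogue turn; return (next_state, reward)."""
    if action == EFFECTIVE[state]:
        return state + 1, 1.0   # the user reveals one more fact
    return state, 0.0           # the user stays guarded


def run_episode(choose):
    """Run one dialogue with a prompt chooser; return facts elicited."""
    state = 0
    for _ in range(MAX_TURNS):
        state, _ = step(state, choose(state))
        if state == N_FACTS:
            break
    return state


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning over (facts revealed so far, prompt index)."""
    rng = random.Random(seed)
    q = [[0.0] * len(PROMPTS) for _ in range(N_FACTS)]
    for _ in range(episodes):
        state = 0
        for _ in range(MAX_TURNS):
            if rng.random() < epsilon:
                action = rng.randrange(len(PROMPTS))   # explore
            else:
                best = max(q[state])                   # exploit, random tie-break
                action = rng.choice(
                    [a for a, v in enumerate(q[state]) if v == best]
                )
            nxt, reward = step(state, action)
            # No bootstrapping once every fact has been revealed.
            future = 0.0 if nxt == N_FACTS else max(q[nxt])
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            state = nxt
            if state == N_FACTS:
                break
    return q


def greedy_policy(q):
    """Pick the highest-valued prompt index for each dialogue state."""
    return [max(range(len(PROMPTS)), key=lambda a: row[a]) for row in q]


q_table = train()
policy = greedy_policy(q_table)
learned_score = run_episode(lambda s: policy[s])

# Random query baseline, averaged over many simulated dialogues.
baseline_rng = random.Random(1)
random_score = sum(
    run_episode(lambda s: baseline_rng.randrange(len(PROMPTS)))
    for _ in range(500)
) / 500

print("learned policy:", [PROMPTS[a] for a in policy])
print("facts elicited (learned vs. random):", learned_score, random_score)
```

In this toy setting the learned policy adapts its prompt to how much the user has already disclosed, while the random baseline wastes turns on prompts that do not work at the current stage. A realistic system would replace the integer state with a representation of the dialogue history and the scripted user with real or LLM-simulated respondents.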