This paper investigates clarification in software engineering tasks, focusing on identifying which information types most impact task success and which questions elicit useful responses. The authors use Shapley attribution and distributional comparisons to quantify task relevance and user answerability, then operationalize these properties as rewards in a multi-stage reinforcement learning setup. The resulting 8B-parameter model, CLARITI, matches GPT-5's resolution rate while generating 41% fewer questions, demonstrating the effectiveness of grounding reward design in empirical analysis.
Stop wasting tokens on irrelevant questions: rewarding models for asking task-relevant, user-answerable questions can slash question count by 41% while matching GPT-5's issue resolution rate.
Humans often specify tasks incompletely, so assistants must know when and how to ask clarifying questions. However, effective clarification remains challenging in software engineering tasks because not all missing information is equally valuable, and questions must target information users can realistically provide. We study clarification in real software engineering tasks by quantifying which types of information most affect task success and which questions elicit useful responses from simulated users. Using Shapley attribution and distributional comparisons, we identify two key properties of effective clarification: task relevance (which information predicts success) and user answerability (what users can realistically provide). We operationalize these properties as multi-stage reinforcement learning rewards to train CLARITI, an 8B-parameter clarification module that matches GPT-5's resolution rate on underspecified issues while generating 41% fewer questions. Our results suggest that grounding reward design in empirical analysis of information impact and user answerability improves clarification efficiency.
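The Shapley-attribution idea above can be sketched concretely: treat each information type as a "player" and a task-success estimate as the coalition value, so each type's Shapley value measures how much it contributes to resolution on average. The sketch below is illustrative only; the information-type names and the success probabilities in `task_success` are invented placeholders, not figures from the paper.

```python
from itertools import combinations
from math import factorial

# Hypothetical information types for an underspecified issue
# (names are illustrative assumptions, not taken from the paper).
INFO_TYPES = ["expected_behavior", "reproduction_steps", "environment"]

def task_success(provided: frozenset) -> float:
    """Stub: success probability given a set of provided info types.
    A real study would estimate these empirically by attempting issues
    with and without each information type; numbers here are made up."""
    scores = {
        frozenset(): 0.10,
        frozenset({"expected_behavior"}): 0.40,
        frozenset({"reproduction_steps"}): 0.35,
        frozenset({"environment"}): 0.15,
        frozenset({"expected_behavior", "reproduction_steps"}): 0.70,
        frozenset({"expected_behavior", "environment"}): 0.45,
        frozenset({"reproduction_steps", "environment"}): 0.40,
        frozenset(INFO_TYPES): 0.80,
    }
    return scores[provided]

def shapley_values(players, value_fn):
    """Exact Shapley values: average marginal contribution of each
    player over all coalitions, weighted by coalition-ordering counts."""
    n = len(players)
    phi = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for k in range(n):
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[p] += weight * (value_fn(s | {p}) - value_fn(s))
    return phi

phi = shapley_values(INFO_TYPES, task_success)
```

By the efficiency property, the values sum to the gap between full information and no information (0.80 − 0.10 here), and types with larger values are the ones clarification questions should target first. Exact computation is exponential in the number of types, so larger studies typically use Monte Carlo sampling of permutations instead.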