The paper introduces CoPA, a benchmark for evaluating personalized question answering (QA) systems. CoPA mines Community-Individual Preference Divergence (CIPD), cases where an individual's choices diverge from community consensus, to distill six key personalization factors (e.g., recency, sentiment), and builds a dataset of 1,985 user profiles for fine-grained, factor-level evaluation. Experiments on CoPA show that current QA models struggle to align with user-specific cognitive preferences, underscoring how generic QA metrics fail to capture personalization.
Current QA benchmarks fail to capture nuanced user preferences, but CoPA reveals six key personalization factors that can help models finally understand what users *really* want.
While LLMs have demonstrated remarkable potential in Question Answering (QA), evaluating personalization remains a critical bottleneck. Existing paradigms predominantly rely on lexical-level similarity or manual heuristics, often lacking sufficient data-driven validation. We address this by mining Community-Individual Preference Divergence (CIPD), where individual choices override consensus, to distill six key personalization factors as evaluative dimensions. Accordingly, we introduce CoPA, a benchmark with 1,985 user profiles for fine-grained, factor-level assessment. By quantifying the alignment between model outputs and user-specific cognitive preferences inferred from interaction patterns, CoPA provides a more comprehensive and discriminative standard for evaluating personalized QA than generic metrics. The code is available at https://github.com/bjzgcai/CoPA.
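To make the factor-level assessment concrete, below is a minimal sketch of how per-factor alignment scores might be aggregated into a single personalization score. This is an illustrative assumption, not CoPA's released implementation: the `UserProfile` structure, the per-factor answer scores, and the simple averaging are all hypothetical, and since only two of the six factors (recency, sentiment) are named above, the rest appear as placeholders.

```python
from dataclasses import dataclass

# Hypothetical factor set: "recency" and "sentiment" are named in the
# paper summary; the remaining four are unnamed placeholders.
FACTORS = ["recency", "sentiment", "factor_3", "factor_4", "factor_5", "factor_6"]


@dataclass
class UserProfile:
    """Illustrative stand-in for one of CoPA's 1,985 user profiles:
    a per-factor preference weight inferred from interaction patterns."""
    user_id: str
    preference_weights: dict[str, float]  # factor -> weight in [0, 1]


def factor_alignment(answer_scores: dict[str, float],
                     profile: UserProfile) -> dict[str, float]:
    """Per-factor alignment: how close the model answer's score on each
    factor is to the user's inferred preference (1.0 = perfect match)."""
    return {
        f: 1.0 - abs(answer_scores[f] - profile.preference_weights[f])
        for f in FACTORS
    }


def overall_alignment(answer_scores: dict[str, float],
                      profile: UserProfile) -> float:
    """Aggregate factor-level alignment into one score by averaging.
    CoPA's actual aggregation may differ; this is the simplest choice."""
    per_factor = factor_alignment(answer_scores, profile)
    return sum(per_factor.values()) / len(per_factor)


# Usage: a hypothetical user who strongly prefers recent answers.
profile = UserProfile(
    user_id="u001",
    preference_weights={"recency": 0.9, "sentiment": 0.5,
                        "factor_3": 0.4, "factor_4": 0.6,
                        "factor_5": 0.3, "factor_6": 0.7},
)
answer_scores = {"recency": 0.8, "sentiment": 0.6,
                 "factor_3": 0.5, "factor_4": 0.6,
                 "factor_5": 0.4, "factor_6": 0.5}
print(f"overall alignment: {overall_alignment(answer_scores, profile):.3f}")
```

The sketch only shows the shape of a factor-level metric, i.e., why it is more discriminative than a single generic score: two answers with the same overall score can differ sharply on individual factors. For actual evaluation, use the code in the linked repository.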