The study introduces LMP2 (Language Model Privacy Probe), a human-centered tool for auditing personal data (PD) associations in LLMs, and audits eight LLMs, including GPT-4o, on their ability to generate PD for both well-known and everyday individuals. The models confidently generate multiple PD categories for well-known individuals, and GPT-4o generates 11 features with 60% or more accuracy for everyday users. In a user study, 72% of participants sought control over model-generated associations with their name, which raises questions about what counts as PD and whether data privacy rights should extend to LLMs.
GPT-4o generates personal features such as gender and hair color for everyday users with 60% or more accuracy, raising serious questions about LLM privacy.
Large language models (LLMs), and conversational agents based on them, are exposed to personal data (PD) during pre-training and during user interactions. Prior work shows that PD can resurface, yet users lack insight into how strongly models associate specific information with their identity. We audit PD across eight LLMs (3 open-source; 5 API-based, including GPT-4o), introduce LMP2 (Language Model Privacy Probe), a human-centered, privacy-preserving audit tool refined through two formative studies (N=20), and run two studies with EU residents to capture (i) intuitions about LLM-generated PD (N1=155) and (ii) reactions to tool output (N2=303). We show empirically that models confidently generate multiple PD categories for well-known individuals. For everyday users, GPT-4o generates 11 features with 60% or more accuracy (e.g., gender, hair color, languages). Finally, 72% of participants sought control over model-generated associations with their name, raising questions about what counts as PD and whether data privacy rights should extend to LLMs.