This paper investigates the privacy risks of synthetic Instagram posts generated by large language models, using authorship attribution as a proxy for re-identification risk. The authors trained a RoBERTa-large classifier on real posts to attribute each post to its author; it reached 81% accuracy on real data but only 16.5–29.7% on synthetic data, indicating a reduced but non-negligible privacy risk. The study also assessed the fidelity of the synthetic data and found a trade-off between privacy and fidelity: higher fidelity coincided with greater privacy leakage.
Synthetic social media data generated by LLMs isn't as private as you might think: authorship attribution attacks can still link synthetic posts to their original authors with up to 29.7% accuracy.
Synthetic data is increasingly used to support research without exposing sensitive user content. Social media data is one type of data that would benefit greatly from representative synthetic equivalents, which can be used to bootstrap research and allow reproducibility through data sharing. However, recent studies show that (tabular) synthetic data is not inherently privacy-preserving. Much less is known about the privacy risks of synthetically generated unstructured text. This work evaluates the privacy of synthetic Instagram posts generated by three state-of-the-art large language models using two prompting strategies. We propose a methodology that quantifies privacy by framing re-identification as an authorship attribution attack. A RoBERTa-large classifier trained on real posts achieved 81% accuracy in authorship attribution on real data, but only 16.5–29.7% on synthetic posts, showing reduced, though non-negligible, risk. Fidelity was assessed via text traits, sentiment, topic overlap, and embedding similarity, confirming the expected trade-off: higher fidelity coincides with greater privacy leakage. This work provides a framework for evaluating privacy in synthetic text and demonstrates the privacy-fidelity tension in social media datasets.
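The attack framing in the abstract can be illustrated with a minimal sketch: train an attribution model on real posts only, then measure how often it assigns a synthetic post to the author whose data seeded it. Everything below is an assumption made for illustration. The bag-of-words cosine classifier is a toy stand-in for the RoBERTa-large classifier, and the authors, posts, and function names are invented; none of it comes from the paper.

```python
import math
from collections import Counter


def features(text):
    """Lowercased word counts; a crude stand-in for a learned text encoder."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(posts_by_author):
    """Build one aggregated word profile per author from REAL posts only."""
    return {a: features(" ".join(p)) for a, p in posts_by_author.items()}


def attribute(profiles, text):
    """Return the author whose profile is most similar to the text."""
    f = features(text)
    return max(sorted(profiles), key=lambda a: cosine(profiles[a], f))


def attribution_accuracy(profiles, labelled_posts):
    """Fraction of (author, text) pairs attributed to the labelled author."""
    hits = sum(attribute(profiles, t) == a for a, t in labelled_posts)
    return hits / len(labelled_posts)


# Hypothetical toy data: three invented authors.
real_train = {
    "alice": ["sunrise hike with my dog today",
              "my dog loves the mountain trail"],
    "bob": ["new espresso recipe at the cafe",
            "cafe morning with fresh espresso"],
    "cara": ["painting watercolor flowers in the studio",
             "studio light for watercolor sketches"],
}
# Held-out real posts, labelled with their true author.
real_test = [
    ("alice", "dog on the trail at sunrise"),
    ("bob", "espresso at the cafe again"),
    ("cara", "watercolor sketches in the studio"),
]
# Synthetic posts, labelled with the author whose data seeded them.
synthetic = [
    ("alice", "a lovely walk outdoors with a pet"),
    ("bob", "fresh espresso this morning"),
    ("cara", "watercolor flowers on paper"),
]

profiles = train(real_train)
real_acc = attribution_accuracy(profiles, real_test)
synthetic_acc = attribution_accuracy(profiles, synthetic)
chance = 1 / len(profiles)

print(f"real: {real_acc:.2f}  synthetic: {synthetic_acc:.2f}  chance: {chance:.2f}")
# real: 1.00  synthetic: 0.67  chance: 0.33
```

In this toy setup the useful comparison is between the synthetic accuracy and the chance level: accuracy well above chance suggests that author-specific signal survived generation, which is the sense in which the abstract calls the residual risk non-negligible. The numbers printed here describe only the invented data and say nothing about the paper's results.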