This paper benchmarks several language models (GPT-4, GPT-4o, GPT-3.5-Turbo, Gemini 1.5 Pro, DeepSeek-V3, Llama 3.2, and BERT) on three social media analytics tasks using a Twitter dataset: authorship verification, post generation, and user attribute inference. The study introduces a systematic sampling framework for authorship verification to address "seen-data" bias, and a user study to evaluate the perceived authenticity of LLM-generated posts. The results provide reproducible benchmarks for LLM-driven social media analytics and highlight the strengths and weaknesses of the different models across these tasks.
LLMs struggle to generate social media posts that real users perceive as authentic, even when conditioned on the user's own writing.
In this study, we present the first comprehensive evaluation of modern LLMs, including GPT-4, GPT-4o, GPT-3.5-Turbo, Gemini 1.5 Pro, DeepSeek-V3, Llama 3.2, and BERT, across three core social media analytics tasks on a Twitter (X) dataset: (I) Social Media Authorship Verification, (II) Social Media Post Generation, and (III) User Attribute Inference. For authorship verification, we introduce a systematic sampling framework over diverse user and post selection strategies and evaluate generalization on newly collected tweets from January 2024 onward to mitigate "seen-data" bias. For post generation, we assess the ability of LLMs to produce authentic, user-like content using comprehensive evaluation metrics. Bridging Tasks I and II, we conduct a user study to measure real users' perceptions of LLM-generated posts conditioned on their own writing. For attribute inference, we annotate occupations and interests using two standardized taxonomies (IAB Tech Lab 2023 and 2018 U.S. SOC) and benchmark LLMs against existing baselines. Overall, our unified evaluation provides new insights and establishes reproducible benchmarks for LLM-driven social media analytics. The code and data are provided in the supplementary material and will also be made publicly available upon publication.
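To make the "seen-data" mitigation concrete, the following is a minimal sketch of how a date cutoff and balanced pair construction for authorship verification might look. It is not the authors' sampling framework, whose user and post selection strategies are not detailed here. The record fields (`user`, `text`, `posted`), the function names, the exact cutoff date of 2024-01-01, and the balancing rule are all assumptions made for illustration.

```python
from datetime import date
from itertools import combinations
import random

# Assumption: "January 2024 onward" is read as on or after 2024-01-01.
CUTOFF = date(2024, 1, 1)

def held_out(tweets, cutoff=CUTOFF):
    """Keep only tweets posted on or after the cutoff date."""
    return [t for t in tweets if t["posted"] >= cutoff]

def make_pairs(tweets, seed=0):
    """Build a balanced set of same-author (1) and different-author (0) pairs."""
    rng = random.Random(seed)
    same, diff = [], []
    for a, b in combinations(tweets, 2):
        pair = (a["text"], b["text"])
        if a["user"] == b["user"]:
            same.append(pair)
        else:
            diff.append(pair)
    # Downsample the larger class so both labels are equally represented.
    n = min(len(same), len(diff))
    rng.shuffle(same)
    rng.shuffle(diff)
    return [(p, 1) for p in same[:n]] + [(p, 0) for p in diff[:n]]

# Hypothetical toy records, for illustration only.
sample = [
    {"user": "u1", "text": "a", "posted": date(2023, 12, 31)},
    {"user": "u1", "text": "b", "posted": date(2024, 1, 1)},
    {"user": "u1", "text": "c", "posted": date(2024, 2, 1)},
    {"user": "u2", "text": "d", "posted": date(2024, 3, 1)},
    {"user": "u2", "text": "e", "posted": date(2024, 3, 2)},
]
pairs = make_pairs(held_out(sample))
```

Each element of `pairs` is a `((text_a, text_b), label)` tuple that could be turned into a same-author prompt for a model under test. Filtering by date before pairing keeps tweets from before the cutoff out of the evaluation set.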