This paper investigates how LLMs impact human writing, focusing on changes in meaning and tone. Through a user study, the authors find that heavy LLM use leads to more neutral, less creative writing that users feel is not in their own voice. Analyzing human essays revised by LLMs and LLM-generated peer reviews, they show that LLMs subtly alter semantic meaning, de-emphasize clarity and significance, and inflate review scores.
LLMs don't just change *how* we write; they subtly distort *what* we mean, leading to blander, less insightful, and potentially biased communication.
Large language models (LLMs) are used by over a billion people globally, most often to assist with writing. In this work, we demonstrate that LLMs not only alter the voice and tone of human writing but also consistently alter its intended meaning. First, we conduct a human user study to understand how people actually interact with LLMs when using them for writing. Our findings reveal that extensive LLM use led to a nearly 70% increase in essays that remained neutral on the topic question, and heavy LLM users were significantly more likely to report that the writing was less creative and not in their own voice. Next, using a dataset of human-written essays collected in 2021, before the widespread release of LLMs, we study how asking an LLM to revise an essay based on the human-written feedback in the dataset induces large changes in the resulting content and meaning. We find that even when LLMs are prompted with expert feedback and asked to make only grammar edits, they still change the text in ways that significantly alter its semantic meaning. We then examine LLM-generated text in the wild, focusing on the 21% of scientific peer reviews at a recent top AI conference that were AI-generated. We find that LLM-generated reviews place significantly less weight on the clarity and significance of the research and assign scores that are, on average, a full point higher. These findings highlight a misalignment between the perceived benefit of AI use and its implicit, consistent effect on the semantics of human writing, motivating future work on how widespread AI writing will affect our cultural and scientific institutions.
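To make the semantic-drift claim concrete, here is a minimal sketch of how one might quantify how far an LLM revision drifts from the original text, using sentence-embedding cosine similarity. The `sentence-transformers` library and the `all-MiniLM-L6-v2` model are assumptions chosen for illustration; this is not the paper's actual measurement pipeline.

```python
# Hypothetical sketch: measuring semantic drift between an original essay
# and its LLM-revised version via sentence-embedding cosine similarity.
# Assumes the sentence-transformers package; not the paper's method.
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

model = SentenceTransformer("all-MiniLM-L6-v2")

def semantic_drift(original: str, revised: str) -> float:
    """Return 1 - cosine similarity; higher means a larger meaning change."""
    emb = model.encode([original, revised], normalize_embeddings=True)
    return 1.0 - float(cos_sim(emb[0], emb[1]))

# Example: a "grammar-only" revision that actually hedges the original stance.
original = "The evidence strongly supports banning single-use plastics."
revised = "There are reasonable arguments on both sides of the single-use plastics debate."
print(f"semantic drift: {semantic_drift(original, revised):.3f}")
```

A drift near 0 indicates a revision that preserves meaning (e.g., a true grammar fix), while larger values flag rewrites of the kind the paper describes, where the stance or emphasis of the text has shifted.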