This paper tackles fine-grained detection of LLM-generated text by distinguishing the roles of "creator" and "editor." The authors introduce RACE, a method that uses Rhetorical Structure Theory to model the creator's logical foundation and extracts Elementary Discourse Unit-level features to capture the editor's stylistic influence. Experiments show that RACE outperforms existing methods in a four-class setting, enabling more nuanced identification of LLM-generated text.
LLM-generated text detection gets a major upgrade: RACE spots the difference between AI as author and AI as editor, supporting policy-aligned regulation.
The misuse of large language models (LLMs) requires precise detection of synthetic text. Existing work mainly follows binary or ternary classification settings, which at best distinguish pure human text, pure LLM text, and collaborative text. This remains insufficient for nuanced regulation, as LLM-polished human text and humanized LLM text often trigger different policy consequences. In this paper, we explore fine-grained LLM-generated text detection under a rigorous four-class setting. To handle this complexity, we propose RACE (Rhetorical Analysis for Creator-Editor Modeling), a fine-grained detection method that characterizes the distinct signatures of the creator and the editor. Specifically, RACE uses Rhetorical Structure Theory to construct a logic graph for the creator's foundation while extracting Elementary Discourse Unit-level features for the editor's style. Experiments show that RACE outperforms 12 baselines in identifying fine-grained types with low false alarms, offering a policy-aligned solution for LLM regulation.
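The four-class setting can be read as a cross of two roles, creator and editor. The sketch below is only an illustration of that label space, not the authors' implementation. The names `Source`, `Provenance`, `LABELS`, and `four_class_label` are hypothetical, and the mapping of "humanized LLM text" to an LLM creator with a human-style editor is an assumption drawn from the abstract's wording; the abstract does not say who or what performs the humanizing.

```python
from enum import Enum
from typing import NamedTuple


class Source(Enum):
    """Who performed a role. HUMAN for the editor of humanized text is an assumption."""
    HUMAN = "human"
    LLM = "llm"


class Provenance(NamedTuple):
    """A text's origin split into the two roles named in the abstract."""
    creator: Source  # who laid down the logical foundation
    editor: Source   # who shaped the surface style


# Hypothetical label names for the four classes; the abstract names the two
# mixed classes explicitly and refers to the other two as pure human/LLM text.
LABELS = {
    Provenance(Source.HUMAN, Source.HUMAN): "pure human text",
    Provenance(Source.LLM, Source.LLM): "pure LLM text",
    Provenance(Source.HUMAN, Source.LLM): "LLM-polished human text",
    Provenance(Source.LLM, Source.HUMAN): "humanized LLM text",
}


def four_class_label(creator: Source, editor: Source) -> str:
    """Map a (creator, editor) pair to one of the four fine-grained classes."""
    return LABELS[Provenance(creator, editor)]


if __name__ == "__main__":
    for provenance, label in LABELS.items():
        print(f"{provenance.creator.value:>5} creator + {provenance.editor.value:>5} editor -> {label}")
```

Under this reading, a ternary detector would merge the two mixed rows into a single "collaborative" class, which is the distinction the abstract argues matters for policy.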