Northwestern | Zhejiang Lab | Jun 17, 2026 | arXiv:2606.18946

SenFlow: Inter-Sentence Flow Modeling for AI-Generated Text Detection in Hybrid Documents

Jingkun Luo, Yifan Sun, Da-Tian Peng, Guanxiong Pei

AI Summary

Existing sentence-level AI-generated text detection (S-AGTD) methods classify each sentence in isolation. This paper introduces SenFlow, which instead models inter-sentence dependencies through structured prediction over the document's sentence sequence. The authors also construct MOSAIC, a benchmark of 16,000 hybrid documents over PubMed and XSum, generated under stringent quality controls that include a perplexity-consistency filter. SenFlow achieves state-of-the-art performance on MOSAIC, with an average Macro-F1 margin of 4.15 percentage points on cross-domain transfer, the hardest of the three evaluation protocols. The analysis further shows that AI insertions retain a generator-dependent sentence-length gap even after the perplexity filter.
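For readers unfamiliar with the metric, the following is a minimal sketch of how Macro-F1 is computed for sentence-level labels. The label encoding (0 for human, 1 for AI) and the toy label sequences are assumptions made for illustration; they are not taken from the paper or from MOSAIC.

```python
from typing import Sequence


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int], labels=(0, 1)) -> float:
    """Unweighted mean of per-class F1 scores."""
    f1_scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical toy document: 0 = human sentence, 1 = AI sentence.
y_true = [0, 0, 0, 1]
y_pred = [0, 0, 1, 1]
score = macro_f1(y_true, y_pred)
```

Because each class contributes equally regardless of how many sentences it has, Macro-F1 penalizes a detector that does well only on the majority class. A margin reported in percentage points is the plain difference between two such scores expressed as percentages.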

Key Contribution

Even after the perplexity-consistency filter equalizes overt cues, AI insertions retain a generator-dependent sentence-length gap that sentence-level detectors still exploit.

Abstract

Sentence-level AI-generated text detection (S-AGTD) for hybrid documents, where humans and LLMs co-author one text, faces two gaps: existing methods classify each sentence in isolation, discarding inter-sentence dependencies, and existing benchmarks omit the newest generation of generators. We construct MOSAIC, a benchmark of 16,000 hybrid documents over PubMed and XSum, generated by DeepSeek-V3.2 and Kimi K2 under stringent quality controls including a perplexity-consistency filter absent from prior benchmarks. We recast S-AGTD as structured prediction over the document sentence sequence and instantiate it as SenFlow, integrating graph-based inter-sentence propagation with linear-chain CRF decoding in a single document-level pass over a sentence graph. SenFlow reaches state-of-the-art performance on MOSAIC, with a +4.15 pp average Macro-F1 margin on cross-domain transfer, the hardest of three protocols of increasing difficulty. We further find that even after the perplexity filter equalizes overt cues, AI insertions retain a generator-dependent sentence-length gap that sentence-level detectors still exploit. Code and data: https://github.com/luojingkun22/SenFlow
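To illustrate why decoding the sentence sequence jointly can differ from classifying each sentence in isolation, here is a minimal sketch of Viterbi decoding for a linear-chain CRF. It is not the SenFlow implementation: the graph-based propagation step is omitted, and the emission scores, transition scores, and label encoding (0 for human, 1 for AI) are made-up values chosen for illustration.

```python
from typing import List, Sequence


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
) -> List[int]:
    """Return the highest-scoring label sequence.

    emissions[t][k]   : score of label k for sentence t
    transitions[i][j] : score of moving from label i to label j
    """
    n_labels = len(transitions)
    score = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(n_labels):
            # Best previous label for ending at label j on sentence t.
            best_i = max(range(n_labels), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final label.
    path = [max(range(n_labels), key=lambda k: score[k])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical scores for a three-sentence document (0 = human, 1 = AI).
emissions = [[2.0, 0.0], [0.4, 0.6], [2.0, 0.0]]
transitions = [[0.5, -0.5], [-0.5, 0.5]]  # mild preference for staying in the same label

independent = [max(range(2), key=lambda k: row[k]) for row in emissions]
joint = viterbi_decode(emissions, transitions)
```

In this toy case the per-sentence argmax labels the weakly scored middle sentence as AI, while joint decoding keeps all three sentences human because the transition scores outweigh the small emission difference.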

Eval Frameworks & Benchmarks | Natural Language Processing

Year: 2026

Venue: N/A
