May 25, 2026 | arXiv:2605.25474

TypedCSIP: Typed Counterfactual Pretraining for Chinese Legislative Conflict Classification

AI Summary

TypedCSIP pretrains a shared encoder for conflict classification, using expert-written minimal revisions of legal provisions as training-time counterfactual supervision. Its typed Counterfactual Selective Intervention Pretraining objective requires a typed factor head to classify each expert revision as carrying no conflict evidence. On the 696-record LCR-CN test split, TypedCSIP improves macro-F1 over the strongest single-model baseline by +0.916 pp on chinese-roberta-wwm-ext and +1.288 pp on SAILER. A cross-task diagnostic shows the Stage-2 encoder is classification-specialized and does not transfer to LCR-CN's superior-law retrieval task.

Key Contribution

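Expert-written minimal revisions of legal provisions can serve as training-time counterfactual supervision, improving conflict-classification macro-F1 over the strongest single-model baseline by +0.916 pp and +1.288 pp on two backbones.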

Abstract

TypedCSIP is a typed counterfactual pretraining method for the conflict-classification task of the LCR-CN benchmark (Zhao et al., 2026): given a (superior, subordinate) provision pair, predict whether the pair conflicts and which of four legal-doctrine types (Responsibility, Condition, Sanction, Definition) describes the inconsistency. We exploit LCR-CN's expert-written minimal revisions as training-time counterfactual supervision; at test time the classifier reads only the original pair. Stage 1 pretrains a shared encoder with a typed Counterfactual Selective Intervention Pretraining objective on (superior, subordinate, expert-revised) triplets, treating the expert revision as a counterfactual that the typed factor head must classify as carrying no conflict evidence. Stage 2 transfers the encoder to a five-way classification head. The confirmatory test was registered on the Open Science Framework before observing v6 measurements: 18 seeds, locked rule requiring mean per-seed difference at least 0.8 pp with both seed-bootstrap and Student-t 95% lower bounds above zero. On the 696-record test split, the v2 variant improves macro-F1 over the strongest single-model baseline by +0.916 pp on chinese-roberta-wwm-ext and +1.288 pp on the SAILER cross-backbone replication; both cells pass the rule. A cold-start stratified result on the 244 Unseen-gB records keeps the gain positive on both backbones. A cross-task diagnostic shows the Stage-2 encoder is classification-specialized and does not transfer to LCR-CN's superior-law retrieval task, so we scope the contribution to conflict classification. We release code, 72 pre-registered prediction files, matched-seed and MLM-control auxiliaries, and the OSF pre-registration record.
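The locked confirmatory rule described in the abstract (mean per-seed difference of at least 0.8 pp, with both the seed-bootstrap and Student-t 95% lower bounds above zero) can be sketched as follows. This is a minimal illustration and not the authors' released code. The function name, the percentile bootstrap, the number of resamples, and the reading of "95% lower bound" as the lower end of a two-sided 95% interval are all assumptions. The default critical value 2.110 is the two-sided 95% Student-t quantile for 17 degrees of freedom, which matches 18 seeds.

```python
import random
import statistics


def passes_locked_rule(diffs_pp, min_mean_pp=0.8, t_crit=2.110,
                       n_boot=10_000, seed=0):
    """Check a pre-registered style decision rule on per-seed differences.

    diffs_pp: per-seed macro-F1 differences (method minus baseline), in
              percentage points. One value per seed.
    t_crit:   Student-t critical value; 2.110 assumes 18 seeds (df = 17)
              and a two-sided 95% interval (an assumption, see lead-in).
    """
    n = len(diffs_pp)
    mean = statistics.fmean(diffs_pp)

    # Student-t lower bound: mean minus t_crit times the standard error.
    sem = statistics.stdev(diffs_pp) / n ** 0.5
    t_lower = mean - t_crit * sem

    # Seed bootstrap (assumed percentile variant): resample seeds with
    # replacement and take the 2.5th percentile of the resampled means.
    rng = random.Random(seed)
    boot_means = sorted(
        statistics.fmean(rng.choices(diffs_pp, k=n)) for _ in range(n_boot)
    )
    boot_lower = boot_means[int(0.025 * n_boot)]

    passed = mean >= min_mean_pp and t_lower > 0 and boot_lower > 0
    return {
        "mean_pp": mean,
        "t_lower_pp": t_lower,
        "boot_lower_pp": boot_lower,
        "passed": passed,
    }


if __name__ == "__main__":
    # Hypothetical per-seed differences for illustration only; these are
    # NOT the paper's measurements.
    example = [1.0 + 0.1 * (i % 3) for i in range(18)]
    print(passes_locked_rule(example))
```

Requiring both bounds makes the rule conservative: a cell passes only if the gain is large enough on average and is stable across seeds under both a parametric and a resampling-based interval.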

Data Curation & Synthetic Data | Eval Frameworks & Benchmarks | Natural Language Processing

