Tsinghua AI | NTU | SMU | University of Massachusetts | XJTU | Apr 20, 2026 | arXiv:2604.17909

Weaponizing the Commons: A Taxonomy and Detection Framework of Abuse on GitHub

Yuli Cheng, Xiaoyu Zhang, Jiongchi Yu, Shiqing Ma, Chao Shen, Yang Liu

AI Summary

This paper investigates abuse behaviors on GitHub, which are often overlooked despite GitHub's critical role in software supply chains. The authors curate a labeled dataset of 392 GitHub abuse instances and propose a taxonomy categorizing abuse symptoms and root causes from a software security perspective. They then develop a unified detection framework that achieves high performance (F1-score > 89%) in identifying various abuse categories across repositories and user accounts.

Key Contribution

GitHub abuse is more widespread and varied than previously thought, demanding a unified detection approach to safeguard software supply chains.

Abstract

GitHub plays a critical role in modern software supply chains, making its security an important research concern. Existing studies have primarily focused on CI/CD automation, collaboration patterns, and community management, while abuse behaviors on GitHub have received little systematic investigation. In this paper, we systematically review and summarize reported GitHub abuse behaviors and conduct an empirical analysis of publicly available abuse cases, curating a manually labeled dataset of 392 GitHub instances. Based on this investigation, we propose a comprehensive taxonomy that characterizes their diverse symptoms and root causes from a software security perspective. Building on this taxonomy, we develop a unified detection framework capable of identifying all abuse categories across repositories and user accounts. Evaluated on the constructed dataset, the proposed framework achieves high performance across all categories (e.g., F1-score exceeding 89%). Collectively, this work advances the understanding of GitHub abuse behaviors and lays the groundwork for large-scale, systematic analysis of the GitHub platform to strengthen software supply chain security.

Code Generation & Program Synthesis | Data Curation & Synthetic Data | Open-Source Models & Weights

Citation Metrics

Citations: 0

Influential citations: 0

References: 41

Year: 2026

Venue: N/A
