Mar 9, 2026 | arXiv:2603.07992

SI-ChainFL: Shapley-Incentivized Secure Federated Learning for High-Speed Rail Data Sharing

Mingjie Zhao, Cheng Dai, Fei Chen, Kaoru Ota, Mianxiong Dong, Bing Guo, Bin Guo

AI Summary

The paper introduces SI-ChainFL, a federated learning framework for high-speed rail data sharing that addresses two problems: insufficient incentives and centralized aggregation. It uses the Shapley value to quantify client contributions based on rare-event utility, data diversity, data quality, and timeliness, and employs a rare-positive-driven client clustering strategy to reduce the computational overhead of Shapley estimation. A blockchain-based consensus protocol provides decentralized aggregation, with aggregation eligibility tied to Shapley incentives. The reported results are improved accuracy and robustness against malicious clients.
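For reference, the standard Shapley value that such contribution metrics build on can be written as below. This is the textbook definition, not a formula taken from the paper: $N$ denotes the set of clients, $v$ a coalition utility function, and $\phi_i$ the contribution of client $i$. How the paper combines rare-event utility, diversity, quality, and timeliness into $v$ is not stated in this summary.

```latex
\phi_i(v) \;=\; \sum_{S \subseteq N \setminus \{i\}}
  \frac{|S|!\,\bigl(|N| - |S| - 1\bigr)!}{|N|!}\,
  \bigl( v(S \cup \{i\}) - v(S) \bigr)
```

The sum ranges over all $2^{|N|-1}$ coalitions that exclude client $i$, which is why exact evaluation becomes expensive as the number of clients grows.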

Key Contribution

A Shapley-incentivized, blockchain-based federated learning framework for high-speed rail data sharing that remains effective with 90% malicious clients under PA attacks and achieves 14.12% higher accuracy than RAGA.

Abstract

In high-speed rail (HSR) systems, federated learning (FL) enables cross-departmental flow prediction without sharing raw data. However, existing schemes suffer from two key limitations: (1) insufficient incentives, leading to free-riding and model poisoning; and (2) centralized aggregation, which introduces a single point of failure. We propose SI-ChainFL, a secure and efficient framework that addresses these issues by combining contribution-aware incentives with decentralized aggregation. First, we quantify client contributions using a Shapley value metric that jointly considers rare-event utility, data diversity, data quality, and timeliness. To reduce computational overhead, we further develop a rare-positive-driven client clustering strategy to accelerate Shapley estimation. Moreover, we design a blockchain-based consensus protocol for decentralized aggregation, where aggregation eligibility is tied to Shapley incentives. This design motivates clients to submit high-quality updates and enables efficient and secure global aggregation. Experiments on MNIST, CIFAR-10, CIFAR-100, and an HSR flow dataset show that SI-ChainFL remains effective under 90% malicious clients in PA attacks, achieving 14.12% higher accuracy than RAGA. Theoretical analysis further guarantees an upper bound on performance.
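To make the contribution-scoring idea concrete, here is a minimal Python sketch of Shapley-based client valuation. It is an illustration under stated assumptions, not the paper's implementation: `toy_utility`, the `SCORES` table, and the top-k rule in `select_eligible` are all hypothetical stand-ins, and generic permutation sampling is used in place of the paper's rare-positive-driven clustering, whose details are not given in the abstract.

```python
import itertools
import math
import random
from typing import Callable, Dict, FrozenSet, List

Utility = Callable[[FrozenSet[str]], float]

# Hypothetical per-client scores; "C" behaves like a free rider (adds nothing).
SCORES = {"A": 0.5, "B": 0.3, "C": 0.0}
CLIENTS = list(SCORES)


def toy_utility(coalition: FrozenSet[str]) -> float:
    """Hypothetical coalition utility: additive scores plus a small A+B synergy."""
    base = sum(SCORES[c] for c in coalition)
    bonus = 0.1 if {"A", "B"} <= coalition else 0.0
    return base + bonus


def exact_shapley(clients: List[str], utility: Utility) -> Dict[str, float]:
    """Exact Shapley values by enumerating every coalition (exponential cost)."""
    n = len(clients)
    values = {c: 0.0 for c in clients}
    for c in clients:
        others = [x for x in clients if x != c]
        for k in range(n):
            weight = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
            for subset in itertools.combinations(others, k):
                s = frozenset(subset)
                values[c] += weight * (utility(s | {c}) - utility(s))
    return values


def monte_carlo_shapley(clients: List[str], utility: Utility,
                        num_permutations: int = 2000, seed: int = 0) -> Dict[str, float]:
    """Approximate Shapley values by averaging marginal gains over random orderings."""
    rng = random.Random(seed)
    values = {c: 0.0 for c in clients}
    for _ in range(num_permutations):
        order = clients[:]
        rng.shuffle(order)
        coalition: FrozenSet[str] = frozenset()
        prev = utility(coalition)
        for c in order:
            coalition = coalition | {c}
            cur = utility(coalition)
            values[c] += cur - prev  # marginal contribution of c in this ordering
            prev = cur
    return {c: v / num_permutations for c, v in values.items()}


def select_eligible(values: Dict[str, float], k: int) -> List[str]:
    """Hypothetical eligibility rule: the k highest-valued clients may aggregate."""
    return sorted(values, key=values.get, reverse=True)[:k]


if __name__ == "__main__":
    exact = exact_shapley(CLIENTS, toy_utility)
    approx = monte_carlo_shapley(CLIENTS, toy_utility)
    print("exact:", exact)
    print("approx:", approx)
    print("eligible:", select_eligible(exact, 2))
```

In this toy setting the free-riding client "C" receives a value of zero and is excluded by the top-k rule, while the A+B synergy bonus is split evenly between "A" and "B". The exact routine scales exponentially in the number of clients, which is the kind of overhead that motivates an accelerated estimator.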

Data Curation & Synthetic Data | Distributed Systems & Hardware

Citation Metrics

Citations: 0

Influential citations: 0

References: 41

Year: 2026

Venue: N/A
