The paper introduces DeFRiS, a Decentralized Federated Reinforcement Learning framework, to address silo-cooperative IoT application scheduling across autonomous administrative entities while preserving data privacy. DeFRiS employs action-space-agnostic policies using candidate resource scoring for knowledge transfer, silo-optimized local learning with GAE and clipped policy updates, and a dual-track non-IID robust decentralized aggregation protocol using gradient fingerprints and tracking. Experiments on a 20-silo testbed demonstrate that DeFRiS outperforms state-of-the-art baselines in response time, energy consumption, tail latency risk, deadline violations, scalability, and adversarial robustness.
Federated reinforcement learning can now handle heterogeneous, adversarial IoT environments with near-zero deadline violations, thanks to a novel decentralized framework that transfers knowledge across silos.
Next-generation IoT applications increasingly span autonomous administrative entities, necessitating silo-cooperative scheduling to leverage diverse computational resources while preserving data privacy. However, realizing efficient cooperation faces significant challenges arising from infrastructure heterogeneity, Non-IID workload shifts, and the inherent risks of adversarial environments. Existing approaches, which rely predominantly on centralized coordination or independent learning, fail to address the incompatibility of state-action spaces across heterogeneous silos and lack robustness against malicious attacks. This paper proposes DeFRiS, a Decentralized Federated Reinforcement Learning framework for robust and scalable Silo-cooperative IoT application scheduling. DeFRiS integrates three synergistic innovations: (i) an action-space-agnostic policy utilizing candidate resource scoring to enable seamless knowledge transfer across heterogeneous silos; (ii) a silo-optimized local learning mechanism combining Generalized Advantage Estimation (GAE) with clipped policy updates to resolve sparse delayed reward challenges; and (iii) a Dual-Track Non-IID robust decentralized aggregation protocol leveraging gradient fingerprints for similarity-aware knowledge transfer and anomaly detection, and gradient tracking for optimization momentum. Extensive experiments on a distributed testbed with 20 heterogeneous silos and realistic IoT workloads demonstrate that DeFRiS significantly outperforms state-of-the-art baselines, reducing average response time by 6.4% and energy consumption by 7.2%, while lowering tail latency risk (CVaR$_{0.95}$) by 10.4% and achieving near-zero deadline violations. Furthermore, DeFRiS achieves over 3 times better performance retention as the system scales and over 8 times better stability in adversarial environments compared to the best-performing baseline.
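The abstract names three standard building blocks without spelling them out: Generalized Advantage Estimation, clipped policy updates, and the CVaR$_{0.95}$ tail-risk metric. The following is a minimal Python sketch of their textbook forms, not the DeFRiS implementation. The function names, the default values of `gamma`, `lam`, and `eps`, and the simple empirical CVaR estimator are all assumptions made for illustration; the paper's own hyperparameters and estimator are not given in this summary.

```python
import math
from typing import List, Sequence


def gae_advantages(rewards: Sequence[float], values: Sequence[float],
                   gamma: float = 0.99, lam: float = 0.95) -> List[float]:
    """Generalized Advantage Estimation over one trajectory.

    `values` must hold len(rewards) + 1 entries; the last entry is the
    bootstrap value of the state reached after the final reward.
    gamma and lam are illustrative defaults, not values from the paper.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the discounted sum of later TD errors.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]  # TD error
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


def clipped_surrogate(ratio: float, advantage: float, eps: float = 0.2) -> float:
    """PPO-style clipped objective for a single sample.

    `ratio` is pi_new(a|s) / pi_old(a|s). Taking the minimum of the clipped
    and unclipped terms removes the incentive to move the ratio outside
    [1 - eps, 1 + eps]. eps = 0.2 is an assumed, commonly used default.
    """
    clipped_ratio = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)


def cvar(samples: Sequence[float], alpha: float = 0.95) -> float:
    """Empirical CVaR: mean of the worst (1 - alpha) fraction of samples.

    Larger samples are treated as worse (e.g. response times). This is a
    simple sorted-tail estimator chosen for clarity; other estimators exist.
    """
    ordered = sorted(samples)
    # Index where the tail starts; always keep at least one sample in the tail.
    tail_start = min(math.ceil(alpha * len(ordered)), len(ordered) - 1)
    tail = ordered[tail_start:]
    return sum(tail) / len(tail)
```

In this sketch, a sparse reward that arrives only at the end of an episode is propagated backwards through the TD errors, which is the property that makes GAE a natural fit for the sparse, delayed rewards the abstract mentions. The CVaR function averages only the slowest responses, so a reduction in CVaR$_{0.95}$ reflects an improvement in the worst 5% of cases rather than in the mean.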