IIT, Poly Montreal | May 5, 2026 | arXiv:2605.04000

Mitigating False Positives in Static Memory Safety Analysis of Rust Programs via Reinforcement Learning

P. Akilesh, L. D. Silva, F. Khomh, Sridhar Chimalakonda

AI Summary

This paper introduces an RL-based approach to reduce false positives in static memory safety analysis for Rust, using contextual features from Rust's MIR to learn a warning suppression policy. The RL agent is trained with feedback from static analysis outputs, supplemented by dynamic validation via cargo-fuzz to selectively validate suspicious warnings. Results show a significant improvement over LLM baselines, achieving 65.2% accuracy and an F1 score of 0.659, while also improving precision from 25.6% to 59.0%.
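To make the idea of a learned warning suppression policy concrete, here is a minimal sketch. It is not the paper's agent: it swaps in a toy contextual bandit with an explore-then-commit rule, and every name in it (the context labels standing in for MIR features, the three actions, the reward values, the fuzzing cost) is an assumption made for illustration. It also assumes labelled training warnings and that fuzzing always reveals the truth, which real fuzzing does not guarantee.

```python
from collections import defaultdict

# Hypothetical action set: report the warning, drop it, or pay to fuzz it.
ACTIONS = ("keep", "suppress", "fuzz")
FUZZ_COST = 0.2  # assumed price of one targeted fuzzing run


def reward(action: str, is_true_bug: bool) -> float:
    """Assumed reward: +1 for a correct decision, -1 for a wrong one.

    Fuzzing is idealized as always revealing the truth, minus a fixed cost.
    """
    if action == "fuzz":
        return 1.0 - FUZZ_COST
    correct = (action == "keep") == is_true_bug
    return 1.0 if correct else -1.0


def train(warnings):
    """Explore-then-commit: try every action on every labelled warning,
    average the reward per (context, action), then pick the best action
    for each context."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for context, is_true_bug in warnings:
        for action in ACTIONS:
            totals[(context, action)] += reward(action, is_true_bug)
            counts[(context, action)] += 1
    contexts = {context for context, _ in warnings}
    return {
        c: max(ACTIONS, key=lambda a: totals[(c, a)] / counts[(c, a)])
        for c in contexts
    }


# Hypothetical labelled warnings: (context bucket, is it a real bug?).
TRAINING_WARNINGS = [
    ("raw_ptr_deref", True),
    ("raw_ptr_deref", True),
    ("bounds_checked_index", False),
    ("bounds_checked_index", False),
    ("generic_send_sync", True),
    ("generic_send_sync", False),
]

policy = train(TRAINING_WARNINGS)
for context in sorted(policy):
    print(context, "->", policy[context])
```

In this toy setup the policy keeps warnings from contexts that are always real bugs, suppresses those that are always spurious, and reserves fuzzing for the ambiguous context, where paying the validation cost beats guessing. That mirrors the selective use of dynamic validation described above, though the actual features, reward design, and learning algorithm are the paper's to define.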

Key Contribution

An RL agent that learns to suppress false positives raises the precision of static memory safety warnings for Rust from 25.6% (raw Rudra output) to 59.0%, outperforming LLM-based baselines.

Abstract

Static analysis tools are essential for ensuring memory safety in Rust programs, particularly as Rust gains adoption in safety-critical domains. However, existing tools such as Rudra and MirChecker suffer from high false positive rates, which diminish developer trust, increase manual review effort, and may obscure genuine vulnerabilities. This paper presents a novel reinforcement learning (RL)-based approach for automatically classifying and suppressing spurious warnings in static memory safety analysis for Rust. To achieve this, we design an RL agent that learns a warning suppression policy by extracting contextual features from Rust's Mid-level Intermediate Representation (MIR) and optimizing its decisions through interaction with static analysis outputs. To improve decision quality, we integrate dynamic validation via cargo-fuzz as an auxiliary feedback mechanism, allowing the agent to selectively validate suspicious warnings through targeted fuzz testing. Our evaluation shows that the proposed approach significantly outperforms state-of-the-art LLM-based baselines, achieving 65.2% accuracy and an F1 score of 0.659, an improvement of 17.1% over the best LLM baseline. With a recall of 74.6%, our method successfully identifies nearly three-quarters of true bugs while substantially reducing false positives, improving precision from 25.6% in raw Rudra output to 59.0%. Incorporating dynamic fuzzing further boosts performance, yielding additional improvements of 10.7 percentage points in accuracy and 8.6 percentage points in F1 score over the RL-only variant. Overall, our work demonstrates that combining reinforcement learning with hybrid static-dynamic analysis can substantially reduce false positives and improve the practical usability of memory safety verification tools for Rust.
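As a quick consistency check on the reported figures, the sketch below recomputes the F1 score from the precision and recall quoted in the abstract, using the standard harmonic-mean definition. It also derives the share of reported warnings that are false positives (one minus precision) before and after filtering. The function and variable names are illustrative choices, not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the abstract.
precision_raw_rudra = 0.256  # precision of raw Rudra output
precision = 0.590            # precision after RL-based filtering
recall = 0.746               # recall after RL-based filtering

f1 = f1_score(precision, recall)
print(f"F1 = {f1:.3f}")  # F1 = 0.659

# Share of reported warnings that are false positives: 1 - precision.
fp_share_before = 1 - precision_raw_rudra
fp_share_after = 1 - precision
print(f"False-positive share: {fp_share_before:.1%} -> {fp_share_after:.1%}")
```

The recomputed F1 agrees with the reported 0.659, so the precision, recall, and F1 figures are mutually consistent. The false-positive share among reported warnings drops from about 74.4% to 41.0%.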

Code Generation & Program Synthesis | Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 53

Year: 2026

Venue: N/A
