UCFApr 29, 2026arXiv:2604.27238

SafeTune: Mitigating Data Poisoning in LLM Fine-Tuning for RTL Code Generation

Mahshid Rezakhani, Nowfel Mashnoor, Kimia Azar, Hadi Kamali

AI Summary

SafeTune is introduced to mitigate data poisoning attacks during LLM fine-tuning for RTL code generation, where adversaries inject hardware Trojans. It uses a GNN to detect anomalous circuit structures and a semantic verification module with text embeddings and XGBoost to assess prompt security. Experiments show SafeTune effectively filters poisoned inputs, enhancing robustness without modifying the LLM architecture.

Key Contribution

Defend against hardware Trojans in LLM-generated RTL code by structurally and semantically verifying training data, without needing to alter the underlying LLM.

Abstract

As large language models (LLMs) are increasingly fine-tuned for hardware tasks like RTL code generation, the scarcity of high-quality datasets often leads to the use of rapidly assembled or generated training data. These datasets frequently lack security verification and are highly susceptible to data poisoning attacks. Such poisoning can cause models to generate syntactically valid but insecure hardware modules that bypass standard functionality checks. To address this, we present SafeTune, a framework designed to harden LLM-based RTL generation against poisoning, specifically focusing on hardware Trojan (HT) insertion. SafeTune integrates two core components: (i) a Graph Neural Network (GNN) that models structural properties to identify anomalous circuitry patterns during fine-tuning, and (ii) a semantic verification module using text embeddings and an XGBoost classifier to assess prompt security. By coupling structural and semantic knowledge, SafeTune effectively filters poisoned inputs without sacrificing legitimate data. Experimental results demonstrate that SafeTune significantly enhances the robustness and reliability of LLM fine-tuning without requiring modifications to the underlying model architecture.

Code Generation & Program Synthesis Data Curation & Synthetic Data Red-Teaming & Adversarial Robustness

Citation Metrics

Citations0

Influential citations0

References0

Year2026

VenueN/A

Related Papers

Finding related papers...

Search

SafeTune: Mitigating Data Poisoning in LLM Fine-Tuning for RTL Code Generation

Related Papers