This paper introduces DKD-KAN, a lightweight intrusion detection framework that distills a Kolmogorov-Arnold Network (KAN) into a much smaller MLP. A high-capacity KAN is first trained as the teacher model; it then guides a smaller MLP student model through decoupled knowledge distillation (DKD). The resulting DKD-MLP model achieves F1-score improvements of 4.18% on the WADI dataset and 3.07% on the SWaT dataset compared to the bare student model, while having significantly fewer parameters than the KAN teacher.
Shrinking intrusion detection models by distilling a KAN into a much smaller MLP yields surprisingly strong performance gains in resource-constrained environments.
Cyber-security systems often operate in resource-constrained environments, such as edge deployments and real-time monitoring systems, where model size and inference time are crucial. We propose a lightweight intrusion detection framework that combines the ability of the Kolmogorov-Arnold Network (KAN) to capture complex features in the data with the efficiency of decoupled knowledge distillation (DKD) training. A high-capacity KAN is first trained to detect attacks performed on the test bed. This model then serves as a teacher that guides a much smaller multilayer perceptron (MLP) student model via DKD. The resulting DKD-MLP model contains only 2,522 and 1,622 parameters for the WADI and SWaT datasets, respectively, which is significantly fewer than the KAN teacher model. This makes it well suited for deployment on devices with limited computational resources. Despite its small size, the student model maintains high performance. Our approach demonstrates the practicality of using a KAN as a knowledge-rich teacher to train much smaller student models without a considerable drop in accuracy in intrusion detection frameworks. We validated our approach on two publicly available datasets. We report F1-score improvements of 4.18% on WADI and 3.07% on SWaT when using the DKD-MLP model, compared to the bare student model. The implementation of this paper is available on our GitHub repository.
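To make the distillation step more concrete, the following is a minimal sketch of a decoupled knowledge distillation loss for a single sample, written in plain Python. It follows the general DKD formulation, which splits the usual distillation term into a target-class part (TCKD) and a non-target-class part (NCKD) weighted by separate coefficients. The abstract does not give the loss settings, so the function names, the `alpha`, `beta`, and `temperature` defaults, and the example logits are all assumptions chosen for illustration, not the authors' implementation.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )


def dkd_loss(student_logits, teacher_logits, target,
             alpha=1.0, beta=1.0, temperature=1.0, eps=1e-12):
    """Sketch of a DKD loss for one sample.

    alpha, beta and temperature are hypothetical defaults, not values
    reported in the abstract.
    """
    p_student = softmax(student_logits, temperature)
    p_teacher = softmax(teacher_logits, temperature)

    # TCKD: compare the binary split "target class vs. everything else".
    b_student = [p_student[target], 1.0 - p_student[target]]
    b_teacher = [p_teacher[target], 1.0 - p_teacher[target]]
    tckd = kl_divergence(b_teacher, b_student)

    # NCKD: compare distributions renormalized over non-target classes only.
    rest_student = max(1.0 - p_student[target], eps)
    rest_teacher = max(1.0 - p_teacher[target], eps)
    n_student = [p / rest_student for i, p in enumerate(p_student) if i != target]
    n_teacher = [p / rest_teacher for i, p in enumerate(p_teacher) if i != target]
    nckd = kl_divergence(n_teacher, n_student)

    # Scaling by temperature**2 is the usual distillation convention.
    return (alpha * tckd + beta * nckd) * temperature ** 2


# Hypothetical three-class logits for one sample whose true class is 0.
teacher = [4.0, 1.0, -2.0]
student = [2.0, 1.5, 0.5]
loss = dkd_loss(student, teacher, target=0, alpha=1.0, beta=2.0, temperature=2.0)
```

In training, a term like this would typically be added to the ordinary cross-entropy loss of the student on the ground-truth labels. Note that with only two classes the renormalized non-target distribution has a single entry, so the NCKD term vanishes and only the TCKD term contributes.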