Mar 18, 2026arXiv:2603.17717

Machine Learning for Network Attacks Classification and Statistical Evaluation of Machine Learning for Network Attacks Classification and Adversarial Learning Methodologies for Synthetic Data Generation

Iakovos-Christos Zarkadis, C. Douligeris, Christos Douligeris

AI Summary

This paper introduces a unified multi-modal network intrusion detection system (NIDS) dataset built from existing datasets, incorporating flow-level data, packet payload information, and temporal contextual features. The authors then train and evaluate various machine learning models for network attack classification using stratified cross-validation on this dataset, achieving stable and reliable intrusion detection. Additionally, they explore adversarial learning techniques for generating synthetic data, assessing its fidelity, utility, and privacy using the SDV framework, f-divergences, and statistical tests.

Key Contribution

Achieve stable and reliable network intrusion detection and high-fidelity synthetic data generation by combining machine learning, adversarial learning, and rigorous statistical evaluation on a new unified multi-modal NIDS dataset.

Abstract

Supervised detection of network attacks has always been a critical part of network intrusion detection systems (NIDS). Nowadays, in a pivotal time for artificial intelligence (AI), with even more sophisticated attacks that utilize advanced techniques, such as generative artificial intelligence (GenAI) and reinforcement learning, it has become a vital component if we wish to protect our personal data, which are scattered across the web. In this paper, we address two tasks, in the first unified multi-modal NIDS dataset, which incorporates flow-level data, packet payload information and temporal contextual features, from the reprocessed CIC-IDS-2017, CIC-IoT-2023, UNSW-NB15 and CIC-DDoS-2019, with the same feature space. In the first task we use machine learning (ML) algorithms, with stratified cross validation, in order to prevent network attacks, with stability and reliability. In the second task we use adversarial learning algorithms to generate synthetic data, compare them with the real ones and evaluate their fidelity, utility and privacy using the SDV framework, f-divergences, distinguishability and non-parametric statistical tests. The findings provide stable ML models for intrusion detection and generative models with high fidelity and utility, by combining the Synthetic Data Vault framework, the TRTS and TSTR tests, with non-parametric statistical tests and f-divergence measures.

Data Curation & Synthetic Data Red-Teaming & Adversarial Robustness

Citation Metrics

Citations0

Influential citations0

References36

Year2026

VenueN/A

Related Papers

Finding related papers...

Search

Machine Learning for Network Attacks Classification and Statistical Evaluation of Machine Learning for Network Attacks Classification and Adversarial Learning Methodologies for Synthetic Data Generation

Related Papers