SUTD

Apr 21, 2026

arXiv:2604.19438

Malicious ML Model Detection by Learning Dynamic Behaviors

Sara Nambiar, Sarang Nambiar, Dhruv Pradhan, E. Soremekun, Ezekiel Soremekun

AI Summary

This paper introduces DynaHug, a novel approach for detecting malicious pre-trained machine learning models (PTMs) by analyzing their runtime behaviors. DynaHug trains a one-class SVM (OCSVM) on the dynamic behaviors of benign, task-specific models to establish a baseline, then flags deviations as potentially malicious. Experiments on over 25,000 PTMs from Hugging Face and MalHug demonstrate that DynaHug achieves up to a 44% improvement in F1-score compared to existing static, dynamic, and LLM-based detectors.
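The train-on-benign, flag-deviations workflow described above can be illustrated with a minimal sketch. It is not DynaHug's implementation: a standardized distance-to-centroid threshold stands in for the one-class SVM so that the example needs only the standard library, and the three features (file opens, network connections, process spawns) and all trace values are hypothetical.

```python
import math
import statistics

def fit_baseline(benign, quantile=0.95):
    """Learn a benign-only baseline: per-feature mean/std plus a distance threshold."""
    dims = list(zip(*benign))
    mean = [statistics.fmean(d) for d in dims]
    # A constant feature has zero spread; fall back to 1.0 to avoid dividing by zero.
    std = [statistics.pstdev(d) or 1.0 for d in dims]

    def score(x):
        # Standardized Euclidean distance from the benign centroid.
        return math.sqrt(sum(((v - m) / s) ** 2 for v, m, s in zip(x, mean, std)))

    dists = sorted(score(x) for x in benign)
    threshold = dists[min(len(dists) - 1, int(quantile * len(dists)))]
    return score, threshold

def is_anomalous(x, score, threshold):
    """Flag a trace whose distance from the benign baseline exceeds the threshold."""
    return score(x) > threshold

# Hypothetical per-model traces: [file_opens, network_connects, process_spawns]
benign_traces = [
    [12, 0, 0], [14, 0, 0], [11, 1, 0], [13, 0, 0],
    [15, 1, 0], [12, 0, 0], [13, 1, 0], [14, 0, 0],
]
score, threshold = fit_baseline(benign_traces)

suspicious = [13, 6, 3]   # unusual network and process activity while loading
typical = [13, 0, 0]

flag_suspicious = is_anomalous(suspicious, score, threshold)
flag_typical = is_anomalous(typical, score, threshold)
```

In a closer reproduction, a one-class SVM (for example scikit-learn's `OneClassSVM`) would replace `fit_baseline`, but the key property is the same: only benign behavior is needed for training.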

Key Contribution

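Existing malicious-model detectors rely on rules, heuristics, or static analysis and ignore runtime behavior, so they either miss malicious models or misclassify benign ones. DynaHug instead learns the runtime behavior of benign models and flags deviations, improving F1-score by up to 44% over existing baselines.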

Abstract

Pre-trained machine learning models (PTMs) are commonly distributed via model hubs (e.g., Hugging Face) in standard formats such as Pickle to facilitate accessibility and reuse. However, this ML supply chain is susceptible to attacks that can execute arbitrary code in trusted user environments, e.g., during model loading. To detect malicious PTMs, state-of-the-art detectors (e.g., PickleScan) rely on rules, heuristics, or static analysis, but ignore runtime model behaviors. Consequently, they either miss malicious models due to under-approximation (blacklisting) or miscategorize benign models due to over-approximation (static analysis or whitelisting). To address this challenge, we propose DynaHug, a novel technique that detects malicious PTMs by learning the behavior of benign PTMs using dynamic analysis and machine learning (ML). DynaHug trains a one-class SVM (OCSVM) classifier on the runtime behaviors of task-specific benign models. We evaluate DynaHug using over 25,000 benign and malicious PTMs from different sources, including Hugging Face and MalHug. We also compare DynaHug to several state-of-the-art detectors, including static, dynamic, and LLM-based detectors. Results show that DynaHug is up to 44% more effective than existing baselines in terms of F1-score. Our ablation study demonstrates that our design decisions (dynamic analysis, OCSVM, clustering) contribute positively to DynaHug's effectiveness.
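To make the load-time threat concrete, here is a minimal sketch of why unpickling can run code and why a deny-list scan under-approximates. The `Payload` class, the `DENYLIST` contents, and the `imported_globals` helper are hypothetical illustrations, not PickleScan's or DynaHug's actual rules; the payload calls the harmless builtin `len` in place of a dangerous function, and the scan for `STACK_GLOBAL` is a simple heuristic that assumes the two preceding string constants are the module and name.

```python
import pickle
import pickletools

class Payload:
    """Hypothetical stand-in for a malicious object embedded in a model file."""
    def __reduce__(self):
        # pickle records "call this callable with these args" and runs it on load.
        return (len, ("executed at load time",))

# Illustrative deny-list only; real scanners maintain their own rule sets.
DENYLIST = {("os", "system"), ("builtins", "eval"), ("builtins", "exec")}

def imported_globals(data: bytes) -> set[tuple[str, str]]:
    """Statically list (module, name) pairs referenced by a pickle stream."""
    found = set()
    strings = []  # string constants seen so far, used to resolve STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            module, name = arg.split(" ", 1)
            found.add((module, name))
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            found.add((strings[-2], strings[-1]))
        elif isinstance(arg, str):
            strings.append(arg)
    return found

data = pickle.dumps(Payload())
result = pickle.loads(data)            # the callable runs here, during loading
globals_seen = imported_globals(data)  # static view: which callables are referenced
flagged = globals_seen & DENYLIST      # empty: the callable is not on the deny-list
```

The callable executes during `pickle.loads`, yet the deny-list reports nothing because `builtins.len` is not listed. This is the under-approximation the abstract attributes to blacklisting, and it is the kind of gap that observing runtime behavior is meant to close.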

Open-Source Models & Weights

Red-Teaming & Adversarial Robustness

Citation Metrics

Citations: 0

Influential citations: 0

References: 66

Year: 2026

Venue: N/A
