This paper introduces a static analysis pipeline leveraging graph kernels and binary capability vectors to attribute Android residential proxy malware to specific proxy networks. The approach extracts control-flow and function-call graphs from a dataset of 3,365 labeled proxy apps across four networks. Experiments using SGD with 5-fold DEX-grouped cross-validation achieve a macro F1 score of 0.985, demonstrating high accuracy in family attribution.
Achieve near-perfect attribution of Android residential proxy malware by fusing graph kernel features with binary capabilities, even amidst code reuse and obfuscation.
Android residential proxy applications represent a growing class of potentially unwanted programs (PUPs) that covertly route third-party traffic through end-user devices, enabling ad fraud, credential abuse, and evasion of geolocation controls by sophisticated threat actors. Attributing an unknown APK to a specific proxy network remains challenging due to code reuse, SDK embedding, and obfuscation across proxy families. We present a static-analysis pipeline for automated proxyware family attribution that extracts graph-structured representations (control-flow and function-call graphs) and behavioral signatures from a labeled corpus of 3,365 Android proxy apps spanning four commercial proxy networks. We evaluate Weisfeiler-Lehman graph kernel features, both alone and fused with binary capability vectors, across multiple classifiers. Using 5-fold DEX-grouped cross-validation to prevent data leakage, SGD achieves a macro F1 of 0.985 on the expanded dataset. To support explainability, we map classifier decisions to automatically generated YARA rules, achieving per-family accuracies of up to 88.45\% after filtering non-discriminative signatures. Finally, we discuss these results in the context of the broader ecosystem. We find that the majority (51.4\%) of applications from the expanded dataset that remain available through APKPure still contain embedded proxy SDK code. Further analysis of developer accounts reveals that 23 developers have published other applications that also contain such functionality, suggesting ongoing commercial relationships between proxy providers and developers.
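The feature construction and leakage-safe splitting described above can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation: the function names (`wl_features`, `fuse`, `grouped_folds`), the node labels, the capability names, and the greedy fold assignment are all hypothetical, chosen only to show how Weisfeiler-Lehman subtree counts might be combined with binary capability flags and how samples sharing a DEX file might be kept in the same fold.

```python
from collections import Counter
import hashlib


def wl_features(adj, labels, iterations=2):
    """Count Weisfeiler-Lehman subtree labels for one graph.

    adj:    dict mapping each node to a list of neighbouring nodes
    labels: dict mapping each node to an initial label string
            (hypothetical; e.g. a coarse basic-block or API category)
    """
    current = dict(labels)
    feats = Counter(current.values())
    for _ in range(iterations):
        relabeled = {}
        for node, neighbours in adj.items():
            # New label = own label plus the sorted multiset of neighbour labels.
            signature = current[node] + "|" + ",".join(sorted(current[n] for n in neighbours))
            relabeled[node] = hashlib.sha1(signature.encode()).hexdigest()[:12]
        current = relabeled
        feats.update(current.values())
    return feats


def fuse(wl_counts, capabilities, capability_names):
    """Concatenate sparse WL counts with a binary capability vector."""
    fused = {f"wl:{label}": count for label, count in wl_counts.items()}
    for name in capability_names:
        fused[f"cap:{name}"] = 1 if name in capabilities else 0
    return fused


def grouped_folds(dex_hashes, k=5):
    """Assign a fold index per sample so that identical DEX hashes never
    straddle folds (assumed greedy balancing: largest groups first)."""
    fold_sizes = [0] * k
    fold_of_group = {}
    groups = Counter(dex_hashes)
    for group, size in sorted(groups.items(), key=lambda kv: (-kv[1], kv[0])):
        target = fold_sizes.index(min(fold_sizes))
        fold_of_group[group] = target
        fold_sizes[target] += size
    return [fold_of_group[h] for h in dex_hashes]
```

Classifier training is omitted here. In practice, library counterparts such as scikit-learn's `GroupKFold` and `SGDClassifier` would likely replace the hand-rolled fold assignment and supply the model; whether the paper used them is not stated in the abstract.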