Jun 18, 2026 · arXiv:2606.19993

Activation- and Influence-Aware Ranks (AIR): Function-Preserving SVD Compression for LLMs

Nico Harder, Daniel Becking, Karsten Mueller, Wojciech Samek

AI Summary

The paper introduces Activation- and Influence-Aware Ranks (AIR), an SVD-based compression framework for large language models (LLMs) that guides each weight matrix's low-rank approximation with a backward-signal influence metric. Starting from the activation-aware solution of SVD-LLM(W), AIR applies a single closed-form alternating least squares (ALS) sweep. Used alone it exceeds ACIP, and AIR+LoRA improves on it further. Relative to SVD-LLM(W), AIR improves perplexity by more than 18% at parameter retention of 60% or lower and matches its quality with roughly 90% less calibration data. The parameter savings carry over to FLOPs, peak memory, and per-token latency.

Key Contribution

AIR improves perplexity over SVD-LLM(W) by more than 18% at parameter retention of 60% or lower, and matches its quality with roughly 90% less calibration data.

Abstract

We present Activation- and Influence-Aware Ranks (AIR), an SVD-based LLM compression framework that guides each weight matrix's low-rank approximation with a backward-signal influence metric. Starting from the activation-aware optimum of SVD-LLM(W), AIR runs a single closed-form alternating least squares (ALS) sweep that integrates influence element-wise under a monotone-descent guarantee. AIR is layer-local and composes orthogonally with end-to-end methods: alone it exceeds ACIP, and AIR+LoRA outperforms it further. AIR improves perplexity over SVD-LLM(W) by >18% at <=60% parameter retention, matches its quality with ~90% less calibration data, and turns parameter savings into FLOP, peak-memory, and per-token latency gains.
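The two stages named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and the abstract does not give AIR's exact objective. The sketch assumes (a) the activation-aware starting point is the whitening-plus-truncated-SVD solution that minimizes ||WX - ABX||_F, and (b) the influence metric enters as a nonnegative matrix M that weights the squared error of each weight entry. The function names, the ridge term `eps`, the weighted objective, and the k(m+n) parameter count are all assumptions made for illustration.

```python
import numpy as np


def rank_for_retention(m, n, retention):
    """Largest rank k whose two factors (m x k and k x n) hold at most
    `retention` * m * n parameters. Assumes factors are stored densely."""
    return max(1, int(retention * m * n / (m + n)))


def activation_aware_lowrank(W, X, rank, eps=1e-6):
    """Rank-`rank` factors (A, B) that minimize ||W X - A B X||_F.

    W: (m, n) weight matrix, X: (n, N) calibration activations.
    Whitening with the Cholesky factor of X X^T turns the problem into a
    plain truncated SVD of W S.
    """
    n = W.shape[1]
    gram = X @ X.T + eps * np.eye(n)        # small ridge keeps the Cholesky factor well defined
    S = np.linalg.cholesky(gram)            # gram = S @ S.T
    U, sig, Vt = np.linalg.svd(W @ S, full_matrices=False)
    A = U[:, :rank] * sig[:rank]            # (m, rank), singular values folded into A
    B = np.linalg.solve(S.T, Vt[:rank].T).T # (rank, n), equals Vt[:rank] @ inv(S)
    return A, B


def weighted_loss(W, A, B, M):
    """Element-wise weighted squared error: sum_ij M_ij * (W_ij - (A B)_ij)^2."""
    return float(np.sum(M * (W - A @ B) ** 2))


def weighted_als_sweep(W, A, B, M):
    """One alternating least squares sweep on `weighted_loss` (assumed objective).

    Each row of A, then each column of B, is replaced by the exact minimizer
    of its own weighted least-squares subproblem, so the loss cannot increase.
    """
    A, B = A.copy(), B.copy()
    for i in range(W.shape[0]):             # update row i of A with B fixed
        Bw = B * M[i]                       # scale column j of B by M[i, j]
        A[i] = np.linalg.lstsq(Bw @ B.T, Bw @ W[i], rcond=None)[0]
    for j in range(W.shape[1]):             # update column j of B with A fixed
        Aw = A * M[:, j:j + 1]              # scale row i of A by M[i, j]
        B[:, j] = np.linalg.lstsq(A.T @ Aw, Aw.T @ W[:, j], rcond=None)[0]
    return A, B
```

A typical call would pick `k = rank_for_retention(m, n, 0.6)`, compute `A, B = activation_aware_lowrank(W, X, k)`, and then run `weighted_als_sweep(W, A, B, M)` once. Because each subproblem is solved exactly, `weighted_loss` after the sweep is never larger than before it, which is the kind of monotone-descent property the abstract refers to.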

Inference & Quantization · Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
