KAIST · Apr 7, 2026 · arXiv:2604.05724

Beyond Semantics: Disentangling Information Scope in Sparse Autoencoders for CLIP

AI Summary

This paper introduces "information scope" as a new dimension for interpreting sparse autoencoder (SAE) features learned on CLIP vision encoders, complementing their semantic meaning. Information scope is defined as how broadly an SAE feature aggregates visual evidence, ranging from localized, patch-specific cues to global, image-level signals. The authors propose the Contextual Dependency Score (CDS) to quantify this scope and show that features with different information scopes influence CLIP's predictions and confidence in systematically different ways.

Key Contribution

Sparse autoencoder features in CLIP are characterized not only by *what* they represent but also by *how broadly* they aggregate visual evidence, and this "information scope" shapes how they influence CLIP's decisions.

Abstract

Sparse Autoencoders (SAEs) have emerged as a powerful tool for interpreting the internal representations of CLIP vision encoders, yet existing analyses largely focus on the semantic meaning of individual features. We introduce information scope as a complementary dimension of interpretability that characterizes how broadly an SAE feature aggregates visual evidence, ranging from localized, patch-specific cues to global, image-level signals. We observe that some SAE features respond consistently across spatial perturbations, while others shift unpredictably with minor input changes, indicating a fundamental distinction in their underlying scope. To quantify this, we propose the Contextual Dependency Score (CDS), which separates positionally stable local scope features from positionally variant global scope features. Our experiments show that features of different information scopes exert systematically different influences on CLIP's predictions and confidence. These findings establish information scope as a critical new axis for understanding CLIP representations and provide a deeper diagnostic view of SAE-derived features.
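The abstract's observation that some features respond consistently across spatial perturbations while others fluctuate can be illustrated with a toy stability statistic. The sketch below is not the paper's Contextual Dependency Score, whose definition is not given on this page. It assumes a hypothetical array `acts` of non-negative SAE activations for one image under several perturbed views, and uses the coefficient of variation as a stand-in measure; the function name `activation_instability` is likewise an illustrative assumption.

```python
import numpy as np


def activation_instability(acts: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Per-feature coefficient of variation across perturbed views.

    acts: array of shape (n_views, n_features) holding non-negative
          activations of each feature for one image under n_views
          spatial perturbations (hypothetical input, not from the paper).
    Returns an array of shape (n_features,); 0 means the feature's
    activation did not change across views, larger means less stable.
    """
    acts = np.asarray(acts, dtype=float)
    mean = acts.mean(axis=0)   # average activation per feature
    std = acts.std(axis=0)     # population std across views
    return std / (mean + eps)  # eps avoids division by zero for dead features


# Toy data: feature 0 is constant across views, feature 1 fluctuates.
acts = np.array([
    [1.0, 0.0],
    [1.0, 2.0],
    [1.0, 0.0],
    [1.0, 2.0],
])
scores = activation_instability(acts)
```

In this toy setup the first feature receives a score of zero and the second a clearly larger one, which mirrors the stable versus variant distinction the abstract describes. How the paper actually maps such behavior to local or global scope is determined by CDS itself.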

Computer Vision · Interpretability & Mechanistic Interp · Multimodal Models

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
