This paper investigates the internal representations of a general-purpose audio self-supervised learning (SSL) model at the neuron level using mechanistic interpretability techniques. By analyzing conditional activation patterns, the authors identify class-specific neurons that exhibit shared responses across semantically and acoustically similar categories. They demonstrate that these neurons provide broad coverage across novel tasks and have a functional impact on classification performance, offering insights into the model's generalization capabilities.
Audio SSL models contain class-specific neurons that cover the classes of diverse novel tasks and respond to shared semantic and acoustic features, which may help explain the models' robust generalization.
In this paper, we analyze the internal representations of a general-purpose audio self-supervised learning (SSL) model from a neuron-level perspective. Despite the strong empirical performance of SSL audio models as feature extractors, the internal mechanisms underlying their robust generalization remain unclear. Drawing on the framework of mechanistic interpretability, we identify and examine class-specific neurons by analyzing conditional activation patterns across diverse tasks. Our analysis reveals that SSL models foster the emergence of class-specific neurons that provide extensive coverage across novel task classes. These neurons exhibit shared responses across semantically related and acoustically similar categories, such as speech attributes and musical pitch. We also confirm that these neurons have a functional impact on classification performance. To our knowledge, this is the first systematic neuron-level analysis of a general-purpose audio SSL model, providing new insights into its internal representation.
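The abstract does not spell out how class-specific neurons are selected, so the following is only a minimal sketch of what an analysis of conditional activation patterns could look like. Everything in it is an assumption made for illustration: the function names, the activation threshold, and the margin criterion (a neuron counts as specific to a class when it fires more often on that class than on all other samples by at least `margin`). It should not be read as the authors' procedure.

```python
import numpy as np


def conditional_activation_rates(acts, labels, num_classes, threshold=0.0):
    """Estimate P(neuron active | class) and P(neuron active | not class).

    acts:   (num_samples, num_neurons) array of neuron activations.
    labels: (num_samples,) array of integer class ids.
    Returns two arrays of shape (num_classes, num_neurons).
    """
    active = acts > threshold  # hypothetical definition of "active"
    in_class = np.zeros((num_classes, acts.shape[1]))
    out_class = np.zeros((num_classes, acts.shape[1]))
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            in_class[c] = active[mask].mean(axis=0)
        if (~mask).any():
            out_class[c] = active[~mask].mean(axis=0)
    return in_class, out_class


def class_specific_neurons(in_class, out_class, margin=0.4):
    """Assumed criterion: the in-class rate exceeds the out-of-class rate by `margin`."""
    diff = in_class - out_class
    return {c: np.flatnonzero(diff[c] >= margin).tolist() for c in range(diff.shape[0])}


def class_coverage(specific):
    """Fraction of classes that have at least one class-specific neuron."""
    return sum(1 for neurons in specific.values() if neurons) / len(specific)


def shared_neurons(specific):
    """Map each neuron that is specific to more than one class to those classes."""
    by_neuron = {}
    for c, neurons in specific.items():
        for n in neurons:
            by_neuron.setdefault(n, []).append(c)
    return {n: cs for n, cs in by_neuron.items() if len(cs) > 1}


# Toy data: 6 samples, 3 neurons, 3 classes (two samples per class).
toy_acts = np.array([
    [1.0, 0.0, 1.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
    [0.0, 1.0, 1.0],
    [0.0, 1.0, 1.0],
    [0.0, 1.0, 1.0],
])
toy_labels = np.array([0, 0, 1, 1, 2, 2])

in_rates, out_rates = conditional_activation_rates(toy_acts, toy_labels, num_classes=3)
specific = class_specific_neurons(in_rates, out_rates)
coverage = class_coverage(specific)
shared = shared_neurons(specific)
```

In this toy setup, neuron 0 fires only for class 0, neuron 1 fires for classes 1 and 2 (a shared response), and neuron 2 fires for every sample and is therefore not selected for any class. A functional-impact check in the same spirit would zero out the selected neurons and compare downstream classification accuracy before and after, which again is an assumed protocol rather than one stated in the abstract.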