May 5, 2026 | arXiv:2605.03914

Ecologically-Constrained Task Arithmetic for Multi-Taxa Bioacoustic Classifiers Without Shared Data

Ragib Amin Nihal, Benjamin Yen, Runwu Shi, Takeshi Ashizawa, Kazuhiro Nakadai

AI Summary

The paper demonstrates that independently fine-tuned BEATs encoders for bioacoustics can be composed into a unified multi-species classifier using task vector arithmetic, without centralized data sharing. The authors find that bioacoustic task vectors are near-orthogonal and that their separation correlates with spectral distribution distance, a pattern consistent with the acoustic niche hypothesis. Task arithmetic also redistributes accuracy relative to joint training: underrepresented taxa gain while species-rich groups lose, which is useful for equitable biodiversity monitoring.

Key Contribution

Forget federated learning: bioacoustic classifiers can be unified across 661 species by simply averaging independently trained task vectors, enabling a collaborative, privacy-preserving paradigm.

Abstract

Training data for bioacoustics is scattered across taxa, regions, and institutions. Centralizing it all is often infeasible. We show that independently fine-tuned BEATs encoders can be composed into a unified 661-species classifier via task vector arithmetic without sharing data. We find that bioacoustic task vectors are near-orthogonal (cosine 0.01-0.09). Their separation aligns closely with spectral distribution distance, a gradient consistent with the acoustic niche hypothesis. This geometry makes simple averaging optimal while sign-conflict methods reduce accuracy by one to six percentage points. Composition also creates an asymmetric gap: species-rich groups lose accuracy relative to joint training while underrepresented taxa gain, a redistribution useful for equitable biodiversity monitoring. We verify linear mode connectivity across all taxonomic pairs, demonstrate zero-shot transfer to new regions, and identify domain negation as a boundary condition where composition fails. These results enable a collaborative paradigm for bioacoustics where institutions share only task vectors to assemble multi-taxa classifiers, preserving data privacy.
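The composition described in the abstract can be illustrated with a minimal sketch. It assumes each model is a dictionary of NumPy arrays sharing the same keys, that a task vector is the fine-tuned weights minus the pretrained weights, and that composition adds the mean task vector back to the pretrained weights. The function names (`task_vector`, `cosine`, `compose_by_averaging`) and the toy weights are hypothetical and not taken from the paper; details such as scaling coefficients or how classifier heads are combined are not covered here.

```python
import numpy as np

def task_vector(pretrained, finetuned):
    """Per-parameter difference between fine-tuned and pretrained weights."""
    return {k: finetuned[k] - pretrained[k] for k in pretrained}

def flatten(tv):
    """Concatenate all parameters into one 1-D vector (keys in sorted order)."""
    return np.concatenate([tv[k].ravel() for k in sorted(tv)])

def cosine(tv_a, tv_b):
    """Cosine similarity between two flattened task vectors."""
    a, b = flatten(tv_a), flatten(tv_b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def compose_by_averaging(pretrained, task_vectors):
    """Add the mean of the task vectors to the pretrained weights."""
    n = len(task_vectors)
    return {k: pretrained[k] + sum(tv[k] for tv in task_vectors) / n
            for k in pretrained}

# Toy weights standing in for a shared pretrained encoder (assumption).
pretrained = {"w": np.ones((2, 2)), "b": np.zeros(2)}

# Two hypothetical fine-tunes that change disjoint parameters,
# so their task vectors are exactly orthogonal in this toy case.
birds = {"w": np.array([[2.0, 1.0], [1.0, 1.0]]), "b": np.zeros(2)}
frogs = {"w": np.array([[1.0, 1.0], [1.0, 2.0]]), "b": np.zeros(2)}

tv_birds = task_vector(pretrained, birds)
tv_frogs = task_vector(pretrained, frogs)

similarity = cosine(tv_birds, tv_frogs)
merged = compose_by_averaging(pretrained, [tv_birds, tv_frogs])
```

In this toy case the two task vectors touch different parameters, so averaging keeps half of each update without one cancelling the other. Only the task vectors would need to be exchanged between institutions, not the underlying recordings.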

Data Curation & Synthetic Data | Scientific Discovery & Drug Design | Speech & Audio

Citation Metrics

Citations: 0
Influential citations: 0
References: 26
Year: 2026
Venue: N/A
