Erasmus University Medical Center | Apr 14, 2026 | arXiv:2604.12647

Adaptive Test-Time Scaling for Zero-Shot Respiratory Audio Classification

Tsai-Ning Wang, H. D. Dekker, Herman Teun den Dekker, Lin-Lin Chen, Neil Zeghidour, Aaqib Saeed

AI Summary

The paper introduces TRIAGE, a tiered zero-shot framework for respiratory audio classification that adaptively scales test-time compute based on input difficulty. TRIAGE routes audio samples through progressively complex reasoning stages: label-cosine scoring, structured matching with clinician descriptors, and retrieval-augmented LLM reasoning. By using a confidence-based router, TRIAGE achieves a mean AUROC of 0.744 across nine respiratory classification tasks, outperforming prior zero-shot methods and matching supervised baselines on several tasks, while reducing overall compute.

Key Contribution

Adaptive test-time compute allocation lets a zero-shot system match or exceed supervised baselines on multiple respiratory audio classification tasks, while nearly half of all samples exit at the cheapest tier.

Abstract

Automated respiratory audio analysis promises scalable, non-invasive disease screening, yet progress is limited by scarce labeled data and costly expert annotation. Zero-shot inference eliminates task-specific supervision, but existing methods apply uniform computation to every input regardless of difficulty. We introduce TRIAGE, a tiered zero-shot framework that adaptively scales test-time compute by routing each audio sample through progressively richer reasoning stages: fast label-cosine scoring in a joint audio-text embedding space (Tier-L), structured matching with clinician-style descriptors (Tier-M), and retrieval-augmented large language model reasoning (Tier-H). A confidence-based router finalizes easy predictions early while allocating additional computation to ambiguous inputs, enabling nearly half of all samples to exit at the cheapest tier. Across nine respiratory classification tasks without task-specific training, TRIAGE achieves a mean AUROC of 0.744, outperforming prior zero-shot methods and matching or exceeding supervised baselines on multiple tasks. Our analysis shows that test-time scaling concentrates gains where they matter: uncertain cases see up to 19% relative improvement while confident predictions remain unchanged at minimal cost.
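
The tiered routing described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the confidence measure (top-two probability margin), the threshold of 0.5, the softmax temperature, the toy two-dimensional embeddings, and the stubbed Tier-M and Tier-H functions are all assumptions made for illustration. Only the overall shape, cosine scoring against label embeddings followed by early exit when confidence is high, comes from the abstract.

```python
import math
from typing import Callable, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(scores: Sequence[float], temperature: float = 0.1) -> List[float]:
    """Temperature-scaled softmax (the temperature value is an assumption)."""
    peak = max(scores)
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def tier_l(audio_emb: Sequence[float],
           label_embs: Sequence[Sequence[float]]) -> List[float]:
    """Label-cosine scoring: compare the audio embedding to each label embedding."""
    return softmax([cosine(audio_emb, label) for label in label_embs])


def confidence(probs: Sequence[float]) -> float:
    """Assumed confidence measure: margin between the top two class probabilities."""
    ranked = sorted(probs, reverse=True)
    return ranked[0] - ranked[1]


Tier = Callable[[Sequence[float]], List[float]]


def route(audio_emb: Sequence[float],
          tiers: Sequence[Tuple[str, Tier]],
          threshold: float = 0.5) -> Tuple[str, List[float]]:
    """Run tiers cheapest-first and exit at the first confident prediction.

    The last tier always finalizes, whatever its confidence.
    """
    if not tiers:
        raise ValueError("at least one tier is required")
    for index, (name, tier) in enumerate(tiers):
        probs = tier(audio_emb)
        is_last = index == len(tiers) - 1
        if is_last or confidence(probs) >= threshold:
            return name, probs
    raise AssertionError("unreachable: the last tier always returns")


# Toy setup: two class labels embedded in a 2-D space (hypothetical values).
label_embs = [[1.0, 0.0], [0.0, 1.0]]

# Tier-M and Tier-H are stubs standing in for descriptor matching and
# retrieval-augmented LLM reasoning, which are far more expensive in practice.
tiers: List[Tuple[str, Tier]] = [
    ("Tier-L", lambda emb: tier_l(emb, label_embs)),
    ("Tier-M", lambda emb: [0.5, 0.5]),
    ("Tier-H", lambda emb: [0.2, 0.8]),
]

easy_exit, _ = route([1.0, 0.0], tiers)       # aligned with the first label
ambiguous_exit, _ = route([1.0, 1.0], tiers)  # equidistant from both labels
```

In this toy setup the input aligned with one label exits at Tier-L, while the input equidistant from both labels is escalated through the stubs to Tier-H. A real system would replace the stubs with the richer stages and would need to choose the threshold empirically.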

Inference & Quantization | Speech & Audio | Training Efficiency & Optimization

Year: 2026

Venue: N/A
