Apr 14, 2026 | arXiv:2604.12527

Audio-Cogito: Towards Deep Audio Reasoning in Large Audio Language Models

Longhao Li, Hongjie Chen, Zehan Li, Qihan Hu, Jian Kang, Jie Li, Yongxiang Li

AI Summary

Audio-Cogito is introduced as an open-source solution to improve deep audio reasoning in Large Audio Language Models (LALMs). A new dataset, curated using Cogito-pipe, provides 545k high-quality audio reasoning samples. Fine-tuning with a self-distillation strategy allows Audio-Cogito to achieve the best performance among open-source models on the MMAR benchmark and top-tier results in the Interspeech 2026 Audio Reasoning Challenge.

Key Contribution

Audio-Cogito achieves the best performance among open-source models on the MMAR benchmark and matches or surpasses certain closed-source models on specific metrics.

Abstract

Recent advances in reasoning models have driven significant progress in text and multimodal domains, yet audio reasoning remains relatively limited. Only a few Large Audio Language Models (LALMs) incorporate explicit Chain-of-Thought (CoT) reasoning, and their capabilities are often inconsistent and insufficient for complex tasks. To bridge this gap, we introduce Audio-Cogito, a fully open-source solution for deep audio reasoning. We develop Cogito-pipe for high-quality audio reasoning data curation, producing 545k reasoning samples that will be released after review. Based on this dataset, we adopt a self-distillation strategy for model fine-tuning. Experiments on the MMAR benchmark, the only audio benchmark evaluating the CoT process, show that our model achieves the best performance among open-source models and matches or surpasses certain closed-source models in specific metrics. Our approach also ranks among the top-tier systems in the Interspeech 2026 Audio Reasoning Challenge.

Multimodal Models | Reasoning & Chain-of-Thought | Speech & Audio

Citation Metrics

Citations: 0

Influential citations: 0

References: 46

Year: 2026

Venue: N/A
