LabelBuddy is an open-source audio annotation tool developed to address the scarcity of infrastructure for capturing the subjective nuances of annotation in music information retrieval (MIR). It decouples the interface from inference using containerized backends, so users can integrate custom models for AI-assisted pre-annotation. The system supports multi-user consensus and model isolation, facilitating richer, human-aligned representation learning for large audio language models (LALMs) and AI agents.
Unleash the power of AI-assisted audio annotation with LabelBuddy, the open-source tool that lets you plug in your own models and build richer, more nuanced music representations.
The advancement of machine learning (ML), Large Audio Language Models (LALMs), and autonomous AI agents in Music Information Retrieval (MIR) necessitates a shift from static tagging to rich, human-aligned representation learning. However, the scarcity of open-source infrastructure capable of capturing the subjective nuances of audio annotation remains a critical bottleneck. This paper introduces LabelBuddy, an open-source, collaborative auto-tagging audio annotation tool designed to bridge the gap between human intent and machine understanding. Unlike static tools, it decouples the interface from inference via containerized backends, allowing users to plug in custom models for AI-assisted pre-annotation. We describe the system architecture, which supports multi-user consensus and containerized model isolation, and present a roadmap for extending the tool to agents and LALMs. Code is available at https://github.com/GiannisProkopiou/gsoc2022-Label-buddy.
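To make the two architectural ideas in the abstract concrete, the following is a minimal sketch, not LabelBuddy's actual code. It assumes a hypothetical `PreAnnotationBackend` contract that any model could satisfy, and a strict-majority rule for multi-user consensus. All names here (`Segment`, `PreAnnotationBackend`, `DummyBackend`, `pre_annotate`, `majority_label`) are illustrative assumptions; the abstract does not specify the backend protocol or the consensus rule, and in a real deployment the backend would presumably sit behind a container boundary rather than run in-process.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional, Protocol


@dataclass(frozen=True)
class Segment:
    """A labeled time span in an audio file (times in seconds)."""
    start: float
    end: float
    label: str


class PreAnnotationBackend(Protocol):
    """Hypothetical contract for a pluggable model backend (assumption)."""

    def predict(self, audio_path: str) -> List[Segment]:
        ...


class DummyBackend:
    """Stand-in for a containerized model; returns fixed segments."""

    def predict(self, audio_path: str) -> List[Segment]:
        return [Segment(0.0, 2.5, "speech"), Segment(2.5, 6.0, "music")]


def pre_annotate(backend: PreAnnotationBackend, audio_path: str) -> List[Segment]:
    """The interface depends only on the contract, not on a specific model."""
    return backend.predict(audio_path)


def majority_label(votes: List[str]) -> Optional[str]:
    """Return the label chosen by a strict majority of annotators, else None.

    Strict majority is an assumed consensus rule, chosen for illustration.
    """
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count * 2 > len(votes) else None


if __name__ == "__main__":
    # Model suggestions that human annotators would then accept or correct.
    for seg in pre_annotate(DummyBackend(), "clip.wav"):
        print(seg)
    # Three annotators vote on one segment; two agree, so consensus exists.
    print(majority_label(["music", "music", "speech"]))
```

Because `pre_annotate` only relies on the `predict` method, swapping in a different model means supplying another object with the same method, which is the kind of decoupling the abstract describes. Returning `None` when no strict majority exists leaves disputed segments flagged for review rather than forcing a label.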