Mar 16, 2026 | arXiv:2603.15905

INSTRUMENTAL: Automatic Synthesizer Parameter Recovery from Audio via Evolutionary Optimization

AI Summary

This paper introduces Instrumental, a system that recovers synthesizer parameters from audio using a differentiable subtractive synthesizer and CMA-ES, a derivative-free evolutionary optimizer. The system optimizes a composite perceptual loss function based on mel-scaled STFT, spectral centroid, and MFCC divergence. Experiments on real audio demonstrate that CMA-ES outperforms gradient descent and that spectral analysis initialization accelerates convergence.

Key Contribution

Instrumental recovers continuous synthesizer parameters directly from audio by pairing a differentiable subtractive synthesizer with CMA-ES, a derivative-free evolutionary optimizer, and so captures the timbral information that audio-to-MIDI tools discard.

Abstract

Existing audio-to-MIDI tools extract notes but discard the timbral characteristics that define an instrument's identity. We present Instrumental, a system that recovers continuous synthesizer parameters from audio by coupling a differentiable 28-parameter subtractive synthesizer with CMA-ES, a derivative-free evolutionary optimizer. We optimize a composite perceptual loss combining mel-scaled STFT, spectral centroid, and MFCC divergence, achieving a matching loss of 2.09 on real recorded audio. We systematically evaluate eight hypotheses for improving convergence and find that only parametric EQ boosting yields meaningful improvement. Our results show that CMA-ES outperforms gradient descent on this non-convex landscape, that more parameters do not monotonically improve matching, and that spectral analysis initialization accelerates convergence over random starts.
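
The loop the abstract describes (render a candidate patch, score it against the target with a spectral loss, and let a derivative-free optimizer propose the next candidates) can be sketched roughly as follows. This is an illustrative sketch under loud assumptions, not the paper's implementation: a toy two-parameter voice stands in for the 28-parameter synthesizer, a log-magnitude spectrum term plus a spectral centroid term stands in for the mel-STFT, centroid, and MFCC loss, and a plain (mu, lambda) evolution strategy with a decaying step size stands in for CMA-ES (it adapts no covariance matrix). All names and constants (`synth`, `features`, `loss`, `evolve`, `SR`, `N`, `F0`) are hypothetical.

```python
import numpy as np

SR = 16000   # sample rate in Hz (assumed for this sketch)
N = 2048     # samples per rendered note
F0 = 220.0   # fixed pitch; only timbre parameters are searched


def synth(params):
    """Toy 2-parameter subtractive voice: sawtooth harmonics through a low-pass.

    params[0] in [0, 1] maps log-uniformly to a cutoff of 200..6400 Hz,
    params[1] in [0, 1] is the output gain.
    """
    cutoff = 200.0 * 32.0 ** params[0]
    gain = params[1]
    t = np.arange(N) / SR
    k = np.arange(1, int((SR / 2) // F0) + 1)  # harmonics below Nyquist
    # Sawtooth 1/k spectrum shaped by a 2-pole low-pass magnitude response.
    amps = (1.0 / k) / np.sqrt(1.0 + (k * F0 / cutoff) ** 4)
    partials = amps[:, None] * np.sin(2 * np.pi * F0 * k[:, None] * t)
    return gain * partials.sum(axis=0)


def features(audio):
    """Return (log-magnitude spectrum, spectral centroid in Hz)."""
    mag = np.abs(np.fft.rfft(audio * np.hanning(N)))
    freqs = np.fft.rfftfreq(N, 1.0 / SR)
    centroid = (freqs * mag).sum() / (mag.sum() + 1e-9)
    return np.log1p(mag), centroid


def loss(params, target_feats):
    """Composite loss: mean log-spectral distance plus normalized centroid gap."""
    log_mag, centroid = features(synth(params))
    t_log_mag, t_centroid = target_feats
    spectral = np.mean(np.abs(log_mag - t_log_mag))
    centroid_term = abs(centroid - t_centroid) / (SR / 2)
    return float(spectral + centroid_term)


def evolve(target_feats, generations=40, pop=16, elite=4, sigma=0.3, seed=0):
    """Simplified (mu, lambda) evolution strategy; a stand-in for CMA-ES."""
    rng = np.random.default_rng(seed)
    mean = np.full(2, 0.5)
    best_x, best_f = mean.copy(), loss(mean, target_feats)
    for _ in range(generations):
        # Sample a population around the current mean, clipped to valid range.
        cand = np.clip(mean + sigma * rng.standard_normal((pop, 2)), 0.0, 1.0)
        scores = np.array([loss(c, target_feats) for c in cand])
        order = np.argsort(scores)
        if scores[order[0]] < best_f:
            best_x, best_f = cand[order[0]].copy(), float(scores[order[0]])
        mean = cand[order[:elite]].mean(axis=0)  # recombine the elite
        sigma *= 0.93                            # fixed step-size decay
    return best_x, best_f


# Usage: recover the parameters of a patch rendered by the same toy synth.
target_params = np.array([0.7, 0.4])
target_feats = features(synth(target_params))
recovered, final_loss = evolve(target_feats)
print("recovered:", recovered, "loss:", final_loss)
```

No gradients are taken anywhere in the loop, which is the property that lets this family of optimizers be applied to a non-convex matching landscape; a real CMA-ES additionally adapts the sampling covariance and step size from the search history.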

Speech & Audio | Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 12

Year: 2026

Venue: N/A
