JKU · Mar 4, 2026 · arXiv:2603.04032

Multi-Stage Music Source Restoration with BandSplit-RoFormer Separation and HiFi++ GAN

Tobias Morocutti, Emmanouil Karystinaios, Jonathan Greif, Gerhard Widmer

AI Summary

This paper introduces a system for Music Source Restoration (MSR) that decomposes the problem into separation and restoration stages. The separation stage uses a BandSplit-RoFormer architecture trained with a three-stage curriculum learning approach, progressing from 4-stem to 8-stem separation. The restoration stage employs a HiFi++ GAN, first trained as a generalist and then fine-tuned into instrument-specific experts.
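The two-stage decomposition can be pictured as a simple dispatch: separate first, then send each stem to its own restorer. The following is a minimal sketch under stated assumptions, not the authors' implementation. The function `restore_mixture`, the stem names, the fallback to a generalist restorer, and the choice to drop the auxiliary `other` stem before restoration are all hypothetical, and the toy callables merely stand in for the BandSplit-RoFormer and HiFi++ GAN models.

```python
from typing import Callable, Dict, List

Waveform = List[float]  # mono samples; a stand-in for a real audio tensor
Separator = Callable[[Waveform], Dict[str, Waveform]]
Restorer = Callable[[Waveform], Waveform]


def restore_mixture(
    mixture: Waveform,
    separator: Separator,
    experts: Dict[str, Restorer],
    generalist: Restorer,
) -> Dict[str, Waveform]:
    """Stage 1: separate the mixture. Stage 2: restore each stem."""
    stems = separator(mixture)
    restored: Dict[str, Waveform] = {}
    for name, audio in stems.items():
        if name == "other":
            continue  # assumption: the auxiliary stem is not restored
        # Assumption: fall back to the generalist if no expert exists.
        restorer = experts.get(name, generalist)
        restored[name] = restorer(audio)
    return restored


# Toy stand-ins (hypothetical): real models would replace these callables.
def toy_separator(mix: Waveform) -> Dict[str, Waveform]:
    return {
        "stem_a": [0.5 * x for x in mix],
        "stem_b": [0.25 * x for x in mix],
        "other": [0.25 * x for x in mix],
    }


toy_experts = {"stem_a": lambda a: [2.0 * x for x in a]}
toy_generalist = lambda a: list(a)

out = restore_mixture([1.0, -1.0], toy_separator, toy_experts, toy_generalist)
```

Keeping the restorers in a plain mapping keyed by stem name means a generalist model can be swapped for a fine-tuned expert one stem at a time, which mirrors the generalist-then-expert training order described above.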

Key Contribution

Improves music source restoration by cascading a BandSplit-RoFormer separator with HiFi++ GAN restorers, outperforming direct methods.

Abstract

Music Source Restoration (MSR) targets recovery of original, unprocessed instrument stems from fully mixed and mastered audio, where production effects and distribution artifacts violate common linear-mixture assumptions. This technical report presents the CP-JKU team's system for the MSR ICASSP Challenge 2025. Our approach decomposes MSR into separation and restoration. First, a single BandSplit-RoFormer separator predicts eight stems plus an auxiliary other stem, and is trained with a three-stage curriculum that progresses from 4-stem warm-start fine-tuning (with LoRA) to 8-stem extension via head expansion. Second, we apply a HiFi++ GAN waveform restorer trained as a generalist and then specialized into eight instrument-specific experts.
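One way to read "8-stem extension via head expansion" is that the separator keeps its shared backbone and the heads already trained in the 4-stem stage, while new output heads are added for the extra stems. The sketch below illustrates that idea only. The abstract does not say how the new heads are initialized, so copying parameters from an existing head (the `init_from` mapping) is an assumption, as are the function name `expand_heads`, the stem names, and the representation of a head as a flat list of weights.

```python
import copy
from typing import Dict, List

HeadParams = List[float]  # stand-in for a real per-stem output module


def expand_heads(
    heads: Dict[str, HeadParams],
    new_stems: List[str],
    init_from: Dict[str, str],
) -> Dict[str, HeadParams]:
    """Build the head set for `new_stems` from an existing set of heads.

    Stems that already have a head keep their trained parameters.
    Stems without one are initialized as a copy of the head named in
    `init_from` (an assumed warm-start scheme, not taken from the paper).
    """
    expanded: Dict[str, HeadParams] = {}
    for stem in new_stems:
        source = stem if stem in heads else init_from[stem]
        expanded[stem] = copy.deepcopy(heads[source])
    return expanded


# Hypothetical 4-stem heads and an 8-stem-plus-other target layout.
four_stem_heads = {"s1": [1.0], "s2": [2.0], "s3": [3.0], "other": [4.0]}
target_stems = ["s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8", "other"]
warm_start = {name: "other" for name in ["s4", "s5", "s6", "s7", "s8"]}

eight_stem_heads = expand_heads(four_stem_heads, target_stems, warm_start)
```

Deep-copying the source parameters keeps the new heads independent of the originals, so later fine-tuning of one stem does not silently modify another.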

Architecture Design (Transformers, SSMs, MoE) · Speech & Audio

Citation Metrics

Citations: 0

Influential citations: 0

References: 12

Year: 2026

Venue: N/A
