This paper introduces a system for Music Source Restoration (MSR) that decomposes the problem into separation and restoration stages. The separation stage uses a BandSplit-RoFormer architecture trained with a three-stage curriculum learning approach, progressing from 4-stem to 8-stem separation. The restoration stage employs a HiFi++ GAN, first trained as a generalist and then fine-tuned into instrument-specific experts.
Cascading a BandSplit-RoFormer separator with HiFi++ GAN restorers improves music source restoration and outperforms direct methods.
Music Source Restoration (MSR) targets recovery of original, unprocessed instrument stems from fully mixed and mastered audio, where production effects and distribution artifacts violate common linear-mixture assumptions. This technical report presents the CP-JKU team's system for the MSR ICASSP Challenge 2025. Our approach decomposes MSR into separation and restoration. First, a single BandSplit-RoFormer separator predicts eight stems plus an auxiliary other stem, and is trained with a three-stage curriculum that progresses from 4-stem warm-start fine-tuning (with LoRA) to 8-stem extension via head expansion. Second, we apply a HiFi++ GAN waveform restorer trained as a generalist and then specialized into eight instrument-specific experts.
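The two-stage data flow described above can be sketched as follows. This is a minimal illustration, not the team's implementation: the stem labels (`stem_0` to `stem_7`), the function `restore_mixture`, and the toy separator and experts are all hypothetical stand-ins for the BandSplit-RoFormer and HiFi++ models. It also assumes that the auxiliary `other` stem is not passed to any restorer, which the text does not state explicitly.

```python
from typing import Callable, Dict, List

Waveform = List[float]  # mono samples; a stand-in for a real audio tensor

# Hypothetical labels: the text only says there are eight target stems
# plus an auxiliary "other" stem.
TARGET_STEMS = [f"stem_{i}" for i in range(8)]
AUX_STEM = "other"

Separator = Callable[[Waveform], Dict[str, Waveform]]
Restorer = Callable[[Waveform], Waveform]


def restore_mixture(
    mixture: Waveform,
    separator: Separator,
    experts: Dict[str, Restorer],
) -> Dict[str, Waveform]:
    """Stage 1: separate the mixture. Stage 2: restore each target stem
    with the expert assigned to that stem."""
    stems = separator(mixture)
    missing = [name for name in TARGET_STEMS if name not in stems]
    if missing:
        raise KeyError(f"separator did not return stems: {missing}")
    # Assumption: the auxiliary stem is dropped rather than restored.
    return {name: experts[name](stems[name]) for name in TARGET_STEMS}


def toy_separator(mixture: Waveform) -> Dict[str, Waveform]:
    """Placeholder separator: gives every stem an equal share of the mixture."""
    share = 1.0 / (len(TARGET_STEMS) + 1)
    return {
        name: [share * x for x in mixture]
        for name in TARGET_STEMS + [AUX_STEM]
    }


def make_toy_expert(gain: float) -> Restorer:
    """Placeholder expert: applies a fixed gain instead of a GAN restorer."""
    def expert(stem: Waveform) -> Waveform:
        return [gain * x for x in stem]
    return expert


toy_experts = {
    name: make_toy_expert(1.0 + i) for i, name in enumerate(TARGET_STEMS)
}
result = restore_mixture([0.9, -0.9], toy_separator, toy_experts)
```

Keeping the separator and the experts behind plain callables mirrors the decomposition in the text: either stage can be retrained or swapped (for example, replacing the generalist restorer with a fine-tuned expert for one stem) without touching the other.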