W. Zai El Amri, O. Tautz, H. Ritter, A. Melnik

Abstract
In this work, we demonstrate how a publicly available, pre-trained Jukebox model can be adapted for audio source separation from a single mixed audio channel. Our neural network architecture, which uses transfer learning, is quick to train, and the results demonstrate performance comparable to other state-of-the-art approaches that require considerably more compute resources, training data, and time. We provide an open-source implementation of our architecture (https://github.com/wzaielamri/unmix).
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| music-source-separation-on-musdb18-hq | Unmix | SDR (avg): 4.188; SDR (bass): 4.073; SDR (drums): 4.925; SDR (others): 2.695; SDR (vocals): 5.06 |
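
The average SDR in the table can be checked against the four per-stem values. The sketch below does that arithmetic and also shows a basic signal-to-distortion ratio computation. The helper names `mean_sdr` and `simple_sdr` are hypothetical, and the plain energy-ratio formula is an assumption used for illustration; the benchmark figures were most likely produced with a BSS Eval style toolkit, whose SDR definition is more involved.

```python
import math

# Per-stem SDR values copied from the benchmark table above.
reported_sdr = {"bass": 4.073, "drums": 4.925, "others": 2.695, "vocals": 5.06}


def mean_sdr(per_stem):
    """Arithmetic mean of the per-stem SDR values."""
    return sum(per_stem.values()) / len(per_stem)


def simple_sdr(reference, estimate):
    """Basic SDR in dB: 10 * log10(reference energy / error energy).

    Illustrative assumption only; not the BSS Eval definition.
    """
    signal = sum(r * r for r in reference)
    error = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if error == 0.0:
        return math.inf  # a perfect estimate has no distortion
    return 10.0 * math.log10(signal / error)


average = mean_sdr(reported_sdr)
print(round(average, 3))

# Toy signals: the estimate is the reference scaled by 0.9.
toy_reference = [1.0, 0.0, -1.0, 0.0]
toy_estimate = [0.9, 0.0, -0.9, 0.0]
toy_sdr = simple_sdr(toy_reference, toy_estimate)
```

The mean of the four stems comes to 4.18825, which agrees with the reported average of 4.188 after rounding.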