Add spleeter effects

Open katsar0v opened this issue 3 years ago • 15 comments

This is just a draft PR. I haven't succeeded in integrating Spleeterpp yet, so I hope someone can help me out.

  • [ ] Integrate Spleeterpp
  • [ ] Add effects separating the stems of the current signal

katsar0v avatar Jul 18 '20 09:07 katsar0v

https://github.com/katsar0v/mixxx/pull/2

Holzhaus avatar Jul 28 '20 10:07 Holzhaus

@katsar0v any news ?

poelzi avatar Oct 12 '20 23:10 poelzi

Yeah, I also hope this gets picked up again soon. It would be great to see Spleeter support in Mixxx.

Holzhaus avatar Oct 13 '20 09:10 Holzhaus

This PR is marked as stale because it has been open 90 days with no activity.

github-actions[bot] avatar Jan 22 '21 00:01 github-actions[bot]

Hey everyone,

A few months ago I was able to freeze the graph for the 5 stems model of the first public release of Spleeter. This is a bit obsolete by now, since TensorFlow uses SavedModel nowadays.

Still, I consider it an accomplishment (it was kinda hard to do) and I consider it viable to integrate Spleeter and Mixxx. Actually, the integration is the easy part in my opinion; the real challenge for me would be the UI/UX re-design, since Mixxx adheres to a one-track-per-deck paradigm.

If there's still interest in integrating Spleeter and Mixxx I'd be happy to work on it, but I think we should take an inverse approach, leaving the integration proper to do as a last step, and working first on getting Mixxx to work nicely with stems.

Any thoughts?

geraldog avatar May 08 '22 21:05 geraldog

Real STEMs support as in Traktor would be fantastic, but isn't this PR about an effect that separates one stem? For this limited use case the effect UI would be sufficient.

JoergAtGithub avatar May 08 '22 21:05 JoergAtGithub

@JoergAtGithub with all due respect, separating one stem only would be next to useless. If you need only one stem, it is better to use Spleeter in the preparation phase of your set, then load the resulting regular WAV file as you would any other audio file.

One other consideration is that, AFAIK, TensorFlow doesn't output the Spleeter samples one by one or in batches. You have to run the separation for the entire audio file. So it is not very easy to add as an "effect".

geraldog avatar May 08 '22 22:05 geraldog

I took a brief look at https://www.virtualdj.com/stems/ and it seems that Virtual DJ (which also uses Spleeter pre-trained models under the hood) uses the EQ knobs on the controller for fine control of stem volume, plus the performance pads for turning a particular stem on/off.

This workflow seems particularly suited for general use and requires minimal UI/UX re-design.
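
A rough sketch of that control scheme, assuming the stems are already separated and available as equal-rate sample lists. The function name `mix_stems` and the stem names are made up for illustration and are not Mixxx or Virtual DJ APIs: each knob is a per-stem gain and each pad is a mute toggle.

```python
from typing import Dict, Iterable, List


def mix_stems(
    stems: Dict[str, List[float]],
    gains: Dict[str, float],
    muted: Iterable[str] = (),
) -> List[float]:
    """Sum the stems sample by sample, applying a gain per stem and skipping muted ones."""
    if not stems:
        return []
    muted_set = set(muted)
    # Hypothetical simplification: truncate to the shortest stem.
    length = min(len(samples) for samples in stems.values())
    mixed = [0.0] * length
    for name, samples in stems.items():
        if name in muted_set:
            continue  # "pad" switched this stem off
        gain = gains.get(name, 1.0)  # "knob" position, 1.0 = unchanged
        for i in range(length):
            mixed[i] += gain * samples[i]
    return mixed


stems = {"vocals": [0.5, 0.5], "drums": [0.25, -0.25]}
both = mix_stems(stems, {"vocals": 0.5})
vocals_only = mix_stems(stems, {"vocals": 0.5}, muted=["drums"])
print(both)         # [0.5, 0.0]
print(vocals_only)  # [0.25, 0.25]
```

The mixing itself is cheap; the expensive part is producing the stems in the first place.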

geraldog avatar May 09 '22 03:05 geraldog

For STEM support I see two major routes that have different requirements.

  1. If the algorithm that separates the stems is "realtime-capable" (can be fed small audio buffers at a time and process each one in less time than the buffer's duration), integrating audio splitting could be done as an effect. That effect could then have 2-5 knobs (depending on how many tracks it can separate into) that act as volume knobs for each channel.
  2. If the effect is not realtime-capable the solution would be much more invasive. It would not only require audio-engine changes, but also UI changes, library changes, etc. The actual integration with Spleeter could be done as the last step, as the first step would be the hard work of implementing "proper" stem support.

Also, we have to take processing power into account. I don't have much experience with Spleeter, but comparing a bunch of different demixing AIs, the one that sounds best currently is Facebook's Demucs. Unfortunately, I don't know if that is technically realtime-capable. It also takes quite a lot of processing power (it pinned my i5-10210U to 100% for 8 minutes while trying to extract 4 channels from a 2 minute track). Hammering the CPU like that during realtime-critical operations (Mixxx audio processing) is dangerous because it can cause buffer underruns (resulting in audible crackling and clicks) depending on your system configuration.
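
To put a number on "realtime-capable", here is a small sketch of the usual real-time factor check (processing time divided by audio duration). The function names are made up for illustration; the only figures used are the 8 minutes and 2 minutes quoted above.

```python
def realtime_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio duration; below 1.0 means faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def is_realtime_capable(processing_seconds: float, audio_seconds: float) -> bool:
    """True only if the audio is processed in less time than it takes to play."""
    return realtime_factor(processing_seconds, audio_seconds) < 1.0


# About 8 minutes of processing for a 2 minute track.
demucs_rtf = realtime_factor(8 * 60, 2 * 60)
print(demucs_rtf)  # 4.0
print(is_realtime_capable(8 * 60, 2 * 60))  # False
```

A factor of 4 on that CPU means the separation would have to run ahead of playback (offline or in a background job) rather than inside the audio callback.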

Swiftb0y avatar May 09 '22 11:05 Swiftb0y

Thank you for the feedback @Swiftb0y

In the quest to integrate Spleeter and Mixxx I've now hacked together a quick and dirty solution to run inference on the Spleeter TensorFlow model from C++.

The frozen model is 188M and can't be attached here. Download it from http://amatriz.net/DISCOGS/spleeter_5_stems_frozen_final.pb

Had issues with markdown not rendering the code correctly. Download code from http://amatriz.net/DISCOGS/spleeter_clean.cxx

Compile with g++ -I/usr/include/tensorflow -o spleetermixxx spleeter_clean.cxx -ltensorflow_cc -lsndfile

Tested on Linux only.

geraldog avatar May 09 '22 23:05 geraldog

OK, so I tried integrating that working code into a real-time JACK app and failed miserably. All I heard was a kling-klang symphony of underruns, even though my 12 cores were almost at 100% the whole time.

I then tried the "online" mode of spleeterpp with better results. Indeed, processor usage is closer to idle. But I was only able to avoid underruns by setting a JACK period size of 8192 frames, that is, with a very high latency.

EDIT: a period size of 4096 frames with 2 periods seems to work fine. That still amounts to about 186 ms of latency at a sample rate of 44100 Hz...
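
For reference, the latency figure follows from frames per period times number of periods, divided by the sample rate. A minimal sketch of that arithmetic (the helper name is made up, and this counts buffering only, not any extra latency in the hardware or the model):

```python
def buffer_latency_ms(period_frames: int, periods: int, sample_rate_hz: int) -> float:
    """Latency contributed by the audio buffer, in milliseconds."""
    return 1000.0 * period_frames * periods / sample_rate_hz


latency = buffer_latency_ms(4096, 2, 44100)
print(f"{latency:.1f} ms")  # 185.8 ms
```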

geraldog avatar May 11 '22 00:05 geraldog

I'm interested in this feature, and potentially helping out with integration. Spleeter README claims it can run 4 stems 100x faster-than-realtime on a GPU. I think what other DJ software does is compute all the stems when the file is loaded, and then cache them somewhere.
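
A minimal sketch of that compute-on-load-then-cache idea, keyed by a hash of the file contents. Everything here is hypothetical: `separate` stands in for whatever model does the actual splitting, the `.stem` file layout is invented, and nothing below reflects how Mixxx or any other DJ software really stores stems.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable, Dict


def file_digest(path: Path) -> str:
    """SHA-256 of the file contents, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def load_stems(
    track: Path,
    cache_dir: Path,
    separate: Callable[[Path], Dict[str, bytes]],
) -> Dict[str, bytes]:
    """Return cached stems for the track, running `separate` only on a cache miss."""
    entry = cache_dir / file_digest(track)
    if entry.is_dir():
        return {p.stem: p.read_bytes() for p in entry.iterdir() if p.suffix == ".stem"}
    stems = separate(track)  # expensive step, hypothetical callable
    entry.mkdir(parents=True)
    for name, data in stems.items():
        (entry / f"{name}.stem").write_bytes(data)
    return stems


# Demo with a fake separator that records how often it is called.
calls = []


def fake_separate(path: Path) -> Dict[str, bytes]:
    calls.append(path)
    return {"vocals": b"v", "drums": b"d"}


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    track = root / "track.wav"
    track.write_bytes(b"not real audio")
    first = load_stems(track, root / "cache", fake_separate)
    second = load_stems(track, root / "cache", fake_separate)

print(len(calls))  # 1
```

A real implementation would also need to handle interrupted writes (for example by writing to a temporary directory and renaming it) and cache eviction, which this sketch ignores.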

dfl avatar Jun 13 '22 23:06 dfl

See https://github.com/mixxxdj/mixxx/pull/4760

JoergAtGithub avatar Jun 13 '22 23:06 JoergAtGithub

> I'm interested in this feature, and potentially helping out with integration. Spleeter README claims it can run 4 stems 100x faster-than-realtime on a GPU. I think what other DJ software does is compute all the stems when the file is loaded, and then cache them somewhere.

If you are really interested in helping add stems to Mixxx, please work on stem support first: the architecture of stem playback, etc.

The AI part is mostly boilerplate code. I'm personally using dlprimitives lately, training an experimental Demucs v2 net with many features turned off for compatibility, but @Swiftb0y may be on the right track by saying ONNX Runtime is what we could get away with in Mixxx.

By the way, the fact that Spleeter is very fast makes CPU inference very feasible if all else fails, unlike Demucs, which is very much GPU-only. If only we could get both and let the DJ choose which to use!

geraldog avatar Jun 30 '22 02:06 geraldog

https://github.com/bigcash/spleeter-pytorch-mnn I haven't tested it yet, but supposedly this could help convert Spleeter from TensorFlow to ONNX.

geraldog avatar Jul 02 '22 03:07 geraldog