demucs.cpp

Shifts Parameter

Open lpn256 opened this issue 8 months ago • 2 comments

The original Demucs implementation includes a shifts parameter: it applies N random "shifts" (random-length intervals of silence at the start) to the input audio, runs the separation on each shifted copy, then removes the shifts and averages the N results. Could this be supported in this implementation?

lpn256 avatar Mar 26 '25 01:03 lpn256

I believe I only implemented the default case of a single shift: https://github.com/sevagh/demucs/blob/release_v4/demucs/apply.py#L187

https://github.com/sevagh/demucs.cpp/blob/main/src/model_apply.cpp#L93-L98

static Eigen::Tensor3dXf
shift_inference(const struct demucscpp::demucs_model &model,
                Eigen::MatrixXf &full_audio, demucscpp::ProgressCallback cb)
{
    // first, apply shifts for time invariance
    // we simply only support shift=1, the demucs default

It would take some new code to support the multiple shift behavior, but it's certainly achievable.
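
As a rough illustration of what that new code might look like, here is a minimal, self-contained sketch of the shift-and-average loop described above. It is not taken from demucs.cpp: the names apply_with_shifts and SeparateFn are hypothetical, the audio is simplified to a mono std::vector<float>, and the separator is a stand-in callback that returns a single track of the same length as its input. The real code would operate on Eigen tensors and return all sources.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <random>
#include <vector>

// Hypothetical stand-in for one inference pass: takes mono samples and
// returns one separated track of the same length as its input.
using SeparateFn =
    std::function<std::vector<float>(const std::vector<float> &)>;

// Sketch of the shift trick: run the separator on several randomly offset
// windows of the zero-padded input, realign each result, and average them.
std::vector<float> apply_with_shifts(const std::vector<float> &audio,
                                     const SeparateFn &separate, int shifts,
                                     std::size_t max_shift, std::mt19937 &rng)
{
    const std::size_t length = audio.size();
    if (shifts < 1)
        shifts = 1;

    // Zero-pad by max_shift samples on both sides of the input.
    std::vector<float> padded(length + 2 * max_shift, 0.0f);
    std::copy(audio.begin(), audio.end(),
              padded.begin() + static_cast<std::ptrdiff_t>(max_shift));

    std::vector<float> out(length, 0.0f);
    std::uniform_int_distribution<std::size_t> dist(0, max_shift);

    for (int s = 0; s < shifts; ++s)
    {
        const std::size_t offset = dist(rng);

        // Window covering padded[offset, max_shift + length): the original
        // audio preceded by (max_shift - offset) samples of silence.
        std::vector<float> shifted(
            padded.begin() + static_cast<std::ptrdiff_t>(offset),
            padded.begin() + static_cast<std::ptrdiff_t>(max_shift + length));

        const std::vector<float> separated = separate(shifted);

        // Undo the shift: the original audio starts at this index.
        const std::size_t start = max_shift - offset;
        for (std::size_t i = 0; i < length; ++i)
            out[i] += separated[start + i];
    }

    // Average the realigned results.
    for (float &v : out)
        v /= static_cast<float>(shifts);
    return out;
}
```

With shifts set to 1 and max_shift set to 0 this reduces to a single plain inference pass. Inference cost grows linearly with the number of shifts, since each shift is a full separation run.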

Do you get a meaningful SDR improvement with more than one shift?

sevagh avatar Mar 26 '25 11:03 sevagh

There's a bit of an SDR improvement with shifts. It often sounds a bit clearer to me.

lpn256 avatar Apr 11 '25 20:04 lpn256