demucs.cpp
Shifts Parameter
The original implementation included a shifts parameter, which adds N random "shifts" (random starting intervals of silence) to the input audio, does the separation, and then removes the shifts and averages all N tracks together. Could this be done with this implementation?
I believe I only implemented the default case of a single shift: https://github.com/sevagh/demucs/blob/release_v4/demucs/apply.py#L187
https://github.com/sevagh/demucs.cpp/blob/main/src/model_apply.cpp#L93-L98
static Eigen::Tensor3dXf
shift_inference(const struct demucscpp::demucs_model &model,
Eigen::MatrixXf &full_audio, demucscpp::ProgressCallback cb)
{
// first, apply shifts for time invariance
// we simply only support shift=1, the demucs default
It would take some new code to support the multiple shift behavior, but it's certainly achievable.
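For illustration, here is a minimal, self-contained sketch of what the multi-shift loop could look like. It is not code from demucs.cpp: `apply_with_shifts` and `SeparateFn` are hypothetical names, the signal is mono, and the model is replaced by a callback that is assumed to return an output of the same length as its input. It follows the procedure described above: pad with silence, pick a random offset per shift, separate, undo the offset, and average.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <random>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the model: takes a mono signal and returns
// an estimate of the same length.
using SeparateFn =
    std::function<std::vector<float>(const std::vector<float> &)>;

// Average `shifts` separations of `mix`, each run on a copy preceded by a
// random amount of silence (between 0 and max_shift samples).
std::vector<float> apply_with_shifts(const std::vector<float> &mix,
                                     const SeparateFn &separate, int shifts,
                                     std::size_t max_shift, unsigned seed = 0)
{
    if (shifts < 1)
        throw std::invalid_argument("shifts must be >= 1");

    const std::size_t length = mix.size();

    // Pad with max_shift zeros on both sides.
    std::vector<float> padded(length + 2 * max_shift, 0.0f);
    std::copy(mix.begin(), mix.end(),
              padded.begin() + static_cast<std::ptrdiff_t>(max_shift));

    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> dist(0, max_shift);

    std::vector<float> out(length, 0.0f);
    for (int s = 0; s < shifts; ++s)
    {
        // Drop `offset` leading zeros, so the original audio now starts
        // at index (max_shift - offset) of the shifted signal.
        const std::size_t offset = dist(rng);
        std::vector<float> shifted(
            padded.begin() + static_cast<std::ptrdiff_t>(offset),
            padded.end());

        const std::vector<float> est = separate(shifted);
        assert(est.size() == shifted.size());

        // Undo the shift and accumulate.
        const std::size_t start = max_shift - offset;
        for (std::size_t i = 0; i < length; ++i)
            out[i] += est[start + i];
    }

    // Average over all shifts.
    for (float &v : out)
        v /= static_cast<float>(shifts);
    return out;
}
```

Note that each extra shift means another full pass through the model, so inference time grows roughly linearly with the number of shifts.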
Do you get a noticeable improvement in SDR with more than 1 shift?
There's a bit of an SDR improvement with shifts. It often sounds a bit clearer to me.