DNN-based_source_separation
Implementation of D3Net
Reference: "D3Net: Densely connected multidilated DenseNet for music source separation"
I'm not sure about the number of channels before frequency concatenation. It depends on the growth rate and the number of D2 blocks, so the two frequency bands can end up with different channel counts. I added a bottleneck convolution so that both frequency bands have the same number of channels.
https://github.com/tky823/DNN-based_source_separation/blob/48621f1dbb015454246f9df6e87d20788fc75114/src/models/d3net.py#L107-L109
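For readers following along, here is a minimal sketch of the idea described above. It is not the code at the permalink. It assumes a 1x1 `nn.Conv2d` as the bottleneck, and the class name `BandBottleneck`, the channel counts (30 and 14), and the tensor sizes are all hypothetical values chosen for illustration. Each band is projected to a shared channel count so that the two feature maps can be concatenated along the frequency axis.

```python
import torch
import torch.nn as nn


class BandBottleneck(nn.Module):
    """Hypothetical helper: 1x1 convolutions that map each band to a shared channel count."""

    def __init__(self, in_channels_low, in_channels_high, out_channels):
        super().__init__()
        # One bottleneck per band, because the input channel counts may differ.
        self.proj_low = nn.Conv2d(in_channels_low, out_channels, kernel_size=1)
        self.proj_high = nn.Conv2d(in_channels_high, out_channels, kernel_size=1)

    def forward(self, x_low, x_high):
        # x_low:  (batch, C_low,  F_low,  T)
        # x_high: (batch, C_high, F_high, T)
        y_low = self.proj_low(x_low)
        y_high = self.proj_high(x_high)
        # Both outputs now have out_channels channels, so they can be
        # concatenated along the frequency axis (dim=2).
        return torch.cat([y_low, y_high], dim=2)


# Assumed example sizes: the bands come out of their sub-networks with
# 30 and 14 channels and cover 64 and 192 frequency bins respectively.
x_low = torch.randn(2, 30, 64, 100)
x_high = torch.randn(2, 14, 192, 100)

bottleneck = BandBottleneck(30, 14, out_channels=16)
y = bottleneck(x_low, x_high)
print(y.shape)  # torch.Size([2, 16, 256, 100])
```

Without the projection, `torch.cat` along the frequency axis would fail, because every dimension other than the concatenation axis must match.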
What needs to be fixed
- [x] multidilated convolution
- [x] timing of batch normalization
- [x] # of output channels of D2 block
- [x] order of D3 block and downsampling layer in Down D3 block
- [x] upsampling layer
I have now updated the D3Net architecture.
Hello, @tky823. I am participating in the Music Demixing Challenge (4th place on Leaderboard A). I suggest you write a training script for D3Net and join a team with me.
Hi, @lyghter. I'm now writing the training code. I am not sure if it will be available soon, but I plan to add it.
The challenge will end on July 31st. If you write the training code this month, I will try to train the model and use it in my solution. Inference with Sony's nnabla implementation is too slow on CPU, so it cannot be used in the challenge.
I invite you to join my team and suggest you keep the new code private until the end of the challenge.
I am currently 4th on Leaderboard A and 5th on Leaderboard B. Top-3 from A and top-3 from B will receive prizes.
I'm not sure how well my implementation of D3Net will work, so I don't know if I'll be able to participate anytime soon. If I can help, I will join your team. I will be working on other tasks for about a week. Maybe I will be able to join after that.
Hello, @tky823. Take a look at this. It looks like this repo contains a PyTorch implementation of D3Net and training code. I just found it and haven't tried it yet.
@lyghter
I'm sorry I couldn't help you. I'm now sharing the scripts and results in `egs/musdb18/d3net`.