how to use this?

Open ghost opened this issue 8 years ago • 16 comments

Is this still being uploaded to GitHub? Is some code missing?

I am wondering if there is any demo script on how to make this work for the training and for the testing phase? Thanks a lot for your time and research!

ghost avatar Jul 09 '17 03:07 ghost

@dankorg It will work if you set the right dataset path in config.py. I just uploaded the MIR-1K dataset, which is free for public use. You can train the model by running train.py and test it by running eval.py. The results will be shown in the Tensorboard audio tab. I'm still updating the code, although it already works. I still need to update the README file, though.

andabi avatar Jul 09 '17 04:07 andabi

I have set the training dataset to DATA_PATH = 'dataset/mir-1k/Wavfile'. But I got this error:

    Traceback (most recent call last):
      File "train.py", line 91, in <module>
        train()
      File "train.py", line 48, in train
        mixed_wav, src1_wav, src2_wav = data.next_wavs(TrainConfig.SECONDS, 1)
      File "/home/kelsey/music-source-separation/data.py", line 24, in next_wavs
        src1, src2 = get_src1_src2_wav(wavfiles, sec, ModelConfig.SR)
      File "/home/kelsey/music-source-separation/preprocess.py", line 36, in get_src1_src2_wav
        return wav[:, 0], wav[:, 1]
    IndexError: too many indices for array

ShengleiH avatar Jul 12 '17 11:07 ShengleiH

@ShengleiH please check the version of librosa and let me know. I used 0.5.1.

andabi avatar Jul 12 '17 12:07 andabi

My version of librosa is also 0.5.1. The file librosa opens is 'dataset/mir-1k/Wavfile/leon_8_06.wav', and the shape of the wave data returned by librosa is ().

    filenames = ['dataset/mir-1k/Wavfile/leon_8_06.wav']
    wav shape = ()
    Traceback (most recent call last):
      File "train.py", line 91, in <module>
        train()
      File "train.py", line 48, in train
        mixed_wav, src1_wav, src2_wav = data.next_wavs(TrainConfig.SECONDS, 1)
      File "/home/kelsey/music-source-separation/data.py", line 24, in next_wavs
        src1, src2 = get_src1_src2_wav(wavfiles, sec, ModelConfig.SR)
      File "/home/kelsey/music-source-separation/preprocess.py", line 36, in get_src1_src2_wav
        return wav[:, 0], wav[:, 1]
    IndexError: too many indices for array

ShengleiH avatar Jul 13 '17 02:07 ShengleiH

@andabi I found the reason: my Python version is 3.6, where map() returns a lazy iterator instead of a list. After wrapping the map() call in list() inside get_src1_src2_wav(), the error disappeared.
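For anyone hitting the same traceback, here is a minimal sketch of the map() behaviour described above. It is not the repo's code: load_pair and the file names are hypothetical stand-ins for whatever get_src1_src2_wav() maps over, and the array shapes are made up for illustration.

```python
import numpy as np


def load_pair(name):
    # Hypothetical stand-in for loading one two-channel file: shape (2, n_samples).
    return np.zeros((2, 4))


names = ["a.wav", "b.wav"]  # hypothetical file names

# Python 3: map() returns an iterator, so NumPy wraps the iterator object itself
# in a 0-dimensional object array instead of stacking the loaded arrays.
lazy = np.array(map(load_pair, names))

# Wrapping the call in list() materialises the results first.
eager = np.array(list(map(load_pair, names)))

print(lazy.shape)   # ()
print(eager.shape)  # (2, 2, 4)

# eager[:, 0] works; lazy[:, 0] raises IndexError (too many indices for array).
first_channel = eager[:, 0]
```

The empty shape printed for the lazy version matches the "wav shape = ()" output reported earlier in this thread.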

ShengleiH avatar Jul 13 '17 02:07 ShengleiH

I managed to train the network, but how do I now test this on a single .wav file? Does the mixed wav have to be mono or stereo?

ghost avatar Jul 14 '17 08:07 ghost

@dankorg Please use eval.py to test. Set NUM_EVAL (the number of test files you want to evaluate) and DATA_PATH in config.py. You can use either mono or stereo.

andabi avatar Jul 14 '17 08:07 andabi

I set num_eval = 1 because I just want to test one file for now. The file is stereo, 16 kHz, 16-bit, and it is in the correct data_path set in config.py. I run python eval.py and I get nothing in the results folder. What is going on? No error is shown in the Python output; it just runs and returns to the command line. Strange.

ghost avatar Jul 15 '17 05:07 ghost

@dankorg the result will be shown in Tensorboard audio tab. You can play the result wav directly in there :)

andabi avatar Jul 15 '17 05:07 andabi

OK, so everything is working in Tensorboard: I get audio and graph data. My last question is which part of the code I should modify (and how) to save the mixture and the separated components automatically as .wav files in the workspace. For example, how can I save the .wav files to the results dir? Thanks a lot!

ghost avatar Jul 16 '17 09:07 ghost

@dankorg You have two options.

  1. I just updated the code, and the result is now written to result_path. Set WRITE_RESULT to True to write the result wavs.
  2. Download the result wavs from Tensorboard.
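If you would rather write the files yourself, the following is a rough sketch of saving a separated signal as a 16-bit .wav using only NumPy and the standard-library wave module. It is not what the repo does internally: write_wav, the tone used as a stand-in for a separated source, and the output file name are all assumptions made for illustration.

```python
import os
import tempfile
import wave

import numpy as np


def write_wav(path, samples, sr):
    """Write a mono float array with values in [-1, 1] as 16-bit PCM."""
    # Clip, scale to the int16 range, and force little-endian byte order.
    pcm = (np.clip(samples, -1.0, 1.0) * 32767).astype("<i2")
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16 bit
        f.setframerate(sr)
        f.writeframes(pcm.tobytes())


# Stand-in for a separated source: one second of a 440 Hz tone at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
voice = 0.5 * np.sin(2 * np.pi * 440 * t)

# Hypothetical output location; point this at your own results directory.
out_path = os.path.join(tempfile.gettempdir(), "voice_example.wav")
write_wav(out_path, voice, sr)
```

Calling write_wav once per signal (mixture, music, voice) with the same sample rate would give three files that can be compared by ear.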

andabi avatar Jul 16 '17 13:07 andabi

Thank you very much! Fantastic work, can't wait to see what you guys come up with next! :)

ghost avatar Jul 16 '17 20:07 ghost

Hello and thanks a lot for this amazing tool!

I've trained the model with the MIR-1K dataset as you suggested, making all the necessary modifications in the config file. I am now evaluating it using .wav files. The result of the analysis is three wav files: music, original, and voice. My problem is that, while the voice is not so dominant in the music file, in the voice file I hear very little difference from the original version. Am I missing something here regarding format/sample_rate or something else?
Thanks a lot in advance for your help!
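One quick thing to rule out is the input format itself. The sketch below reads the header of a PCM .wav file with the standard-library wave module and reports channels, sample rate, and bit depth, which can then be compared against the settings in the config file. describe_wav and the demo file are hypothetical and not part of the repo.

```python
import os
import tempfile
import wave


def describe_wav(path):
    """Return channel count, sample rate, bit depth and duration of a PCM wav file."""
    with wave.open(path, "rb") as f:
        return {
            "channels": f.getnchannels(),
            "sample_rate": f.getframerate(),
            "bits": 8 * f.getsampwidth(),
            "seconds": f.getnframes() / f.getframerate(),
        }


# Self-contained demo: write half a second of stereo silence, then inspect it.
demo_path = os.path.join(tempfile.gettempdir(), "describe_wav_demo.wav")
with wave.open(demo_path, "wb") as f:
    f.setnchannels(2)
    f.setsampwidth(2)
    f.setframerate(16000)
    f.writeframes(b"\x00" * (8000 * 2 * 2))  # 8000 frames x 2 channels x 2 bytes

info = describe_wav(demo_path)
print(info)
```

To check a real file, pass its path to describe_wav in place of demo_path.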

shoegazerstella avatar Oct 24 '18 08:10 shoegazerstella

@shoegazerstella I have the same problem as you. Did you find the solution? ... please tell me.

apprentissagee avatar May 28 '19 13:05 apprentissagee

Hey @apprentissagee, as I did not have much time to retrain a network, I ended up using this other repo. Highly recommended.

shoegazerstella avatar May 28 '19 14:05 shoegazerstella

Hey @shoegazerstella, thank you for your answer.

  1. Could you tell me whether you trained the model yourself or used pre-trained models?
  2. Did you work with the MUSDB18 data or CCMixter? I am really interested...

apprentissagee avatar May 28 '19 14:05 apprentissagee