spleeter
[Discussion] How to improve track / stem separation?
Hi all, I usually use the 5stems model, and I would like to know the following:
1) Why does the "other" stem contain guitar and piano/keyboards together, while the "piano" stem contains only artefacts of piano sound and my "bass" stem is usually muddy? How can I obtain a better separation of those instruments?
2) Would tweaking musdb_config.json (as in "musdb18 #81") not yield better results than the original?
3) Why is base_config.json used for all stem models? Would using the "finetune" versions not yield better instrument separation?
4) Is there a difference between using musdb and musdb18?
I've been trying to wrap my head around this for a while and I'm a little confused. Thank you for shedding some light.
Hi @deskstar90, You cannot expect better results from the provided pre-trained models without retraining them (which is far from trivial). Indeed, the pre-trained models may generate incorrect separation or artefacts. Except for a few hyper-parameters, tweaking the config files will have no impact on separation, only on training. So if you don't plan to retrain models, it is not worth tweaking the config. The "finetune" versions of the models only include extra parameters that make it possible to easily continue training with your own data. So those models just contain more data, which is not used during separation and is only useful if you plan to fine-tune the models. We used the musdb18 dataset for training. We may just have written musdb instead of musdb18 in the doc and in config files.
Thank you very much romi!! It's not that evident how to go about it, but the more I read the wiki and the issues / discussions about re-training, the more I understand. I'd like to do a before-and-after comparison of the results with musdb (which I installed) to compare the separation numbers. Let me check my understanding: do I need to first re-train with the musdb_config.json file and not the stem files, or is this not necessary? Once re-trained, I then run the evaluate configuration file, which will give me the separation results. Once this is completed, I can then run the stem model files? Hoping I got the procedure right..
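For readers following along, the retrain-then-evaluate sequence asked about above can be sketched as two command lines, built here as Python argument lists rather than executed. The `-p`, `-d`, `--mus_dir` and `-o` flags are the ones shown in the Spleeter wiki and may differ between versions, so treat them as assumptions and confirm with `spleeter train --help` and `spleeter evaluate --help`. The function name `build_workflow` and all paths are hypothetical placeholders.

```python
def build_workflow(config, musdb_dir, eval_dir="eval_output"):
    """Return the train and evaluate commands as argv lists (nothing is run).

    Assumption: flags follow the Spleeter wiki examples; verify with --help.
    """
    # Step 1: train a model described by the JSON config on the dataset.
    train = ["spleeter", "train", "-p", config, "-d", musdb_dir]
    # Step 2: evaluate that same config on the musdb test set.
    evaluate = [
        "spleeter", "evaluate",
        "-p", config,
        "--mus_dir", musdb_dir,
        "-o", eval_dir,
    ]
    return [train, evaluate]


if __name__ == "__main__":
    # Placeholder paths; replace with your own locations.
    for cmd in build_workflow("configs/musdb_config.json", "/path/to/musdb18"):
        print(" ".join(cmd))
```

Building the commands as lists makes it easy to pass them to `subprocess.run` later without worrying about shell quoting of paths that contain spaces.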
So, I needed to re-install python / spleeter, as my previous install was anaconda and was too convoluted. Anyhow, it seems to be working fine with the path environments. Before I get to the actual re-training, I tested the 4/5stem & 4/5stem-16kHz models and they work fine, but the 4/5stems-finetune models don't. This may not make a difference, as I read in a few places, but I'd still like to know if this is on my side, to try to prevent any issues. I basically copied an existing 5stem.json and renamed it to 5stem-finetune.json, but I get the following.. no checksum..
OK, good news (kinda..). I think I'm on the right track: I was able to train my first model on musdb18 and got model checkpoints, but I also got errors such as "audio file not found" and "error while loading audio" / "ffprobe". I did get a "model training done" confirmation. Can someone please check my file and tell me what I'm doing wrong? A BIG thank you!!
OK, I found my error in the musdb path; it trained successfully now. For evaluation, I got the error "please install musdb and museval" (even though I already have musdb installed..). I'll try to re-install anyway..
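A path mistake like the one described above can be caught before a long training run by checking that every file listed in the training CSV actually exists. The sketch below is only an illustration: it assumes the CSV has a header row and that audio columns end in `_path` (as in `mix_path` or `vocals_path`), which matches the CSV files shipped with Spleeter's configs as far as I can tell. The function name `find_missing_audio` is hypothetical.

```python
import csv
from pathlib import Path


def find_missing_audio(csv_path, audio_root):
    """List (line_number, column, path) for referenced files that do not exist.

    Assumption: columns holding audio files have names ending in "_path",
    and relative paths are resolved against `audio_root`.
    """
    missing = []
    root = Path(audio_root)
    with open(csv_path, newline="") as handle:
        # Data rows start on line 2 because line 1 is the header.
        for line_number, row in enumerate(csv.DictReader(handle), start=2):
            for column, value in row.items():
                if column and column.endswith("_path") and value:
                    # An absolute `value` replaces `root` when joined.
                    if not (root / value).is_file():
                        missing.append((line_number, column, value))
    return missing


if __name__ == "__main__":
    # Placeholder locations; adjust to your own setup.
    for problem in find_missing_audio("configs/musdb_train.csv", "/path/to/musdb18"):
        print(problem)
```

An empty result only means the files are present; it does not prove that ffprobe can decode them.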
Evaluation is not going according to plan... I'm getting "nan" errors.. I tried to look this up and could not find the answer. Does anyone know why? Any clues?
I found my issue: I trained on the wrong dataset type.. I will report metrics once my re-training and evaluation are completed..
@deskstar90 thanks for all the updates
@tombohub No problem. I am not a programmer and this is no easy task, but I'm finding my own mistakes as I go along; hopefully this helps someone else figure it out as well. But now I'm having issues with training: it's taking 98% of my CPU resources but no GPU, and after 23 hrs of training I'm stuck 3/4 of the way through the musdb dataset. I'd like to train only on the rock/pop genre. Is it possible to use only half of the dataset, or only that genre (which may or may not be enough from the dataset)? Any suggestions?
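One way to train on only part of the dataset, sketched here under assumptions, is to write a reduced copy of the training CSV and point the config at it. The training CSV does not carry genre labels as far as I know, so this example takes a hand-picked list of track-name fragments instead. The function name `filter_tracks`, the `mix_path` column and the idea of editing the `train_csv` entry in the config are assumptions to check against your own files.

```python
import csv


def filter_tracks(src_csv, dst_csv, keep, column="mix_path"):
    """Copy rows whose `column` contains any fragment in `keep`.

    Returns the number of rows written (header not counted).
    Assumption: `column` exists in the CSV header.
    """
    kept = 0
    with open(src_csv, newline="") as src, open(dst_csv, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            # Keep the row if any chosen track-name fragment matches.
            if any(fragment in row[column] for fragment in keep):
                writer.writerow(row)
                kept += 1
    return kept


if __name__ == "__main__":
    # Hypothetical track-name fragments chosen by hand for one genre.
    rock_pop = ["Artist A", "Artist B"]
    count = filter_tracks("configs/musdb_train.csv", "configs/rock_pop_train.csv", rock_pop)
    print(f"kept {count} tracks")
```

Whether the remaining tracks are enough to train a useful model is a separate question that this sketch does not answer.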
Question: Does the training end when the checkpoint is complete, or does it have to complete training on the whole dataset?
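As a rough aid for the question above: assuming Spleeter follows the usual TensorFlow Estimator convention, training length is set by a step count in the config rather than by one pass over the dataset, and checkpoints are periodic snapshots taken along the way. The sketch below only does the arithmetic. The key names `train_max_steps`, `save_checkpoints_steps` and `batch_size` are assumed to match musdb_config.json, and the numbers are made up for illustration.

```python
def training_summary(config):
    """Summarise a step-bounded training run from its config values.

    Assumption: training stops at `train_max_steps`, and a checkpoint is
    written every `save_checkpoints_steps` steps.
    """
    max_steps = int(config["train_max_steps"])
    every = int(config["save_checkpoints_steps"])
    batch_size = int(config["batch_size"])
    return {
        # Snapshots written at regular intervals before training stops.
        "periodic_checkpoints": max_steps // every,
        # Each step consumes one batch of training examples.
        "examples_processed": max_steps * batch_size,
    }


if __name__ == "__main__":
    # Illustrative values only; read the real ones from your config file.
    example = {"train_max_steps": 1000, "save_checkpoints_steps": 300, "batch_size": 4}
    print(training_summary(example))
```

Under these assumptions, a finished checkpoint does not mean training is over; the run continues until the step limit is reached.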
Well, still no go after training 3 times; it just hangs at the same file. The musdb dataset is 29 GB and my cache is also 29 GB. Not sure if I'm running into some RAM issue because of the training processes with tensorflow, or missing files, or some other bug preventing the training from completing. I get no errors; it just hangs, and on two different PCs. I would appreciate some information on this if possible.