Genre-Sample-Classifier
[ISSUE] Genre Classifier returns NaN losses
I've spent an unhealthy number of hours trying to figure this out, and so far I only get NaN (Not a Number) losses, which seem to be keeping the model from learning.
```
Epoch 00001: saving model to savedweights_1/chkp-0.47580644488334656.ckpt
105/105 [==============================] - 1s 10ms/step - loss: nan - accuracy: 0.4797 - val_loss: nan - val_accuracy: 0.4758
Epoch 2/40
103/105 [============================>.] - ETA: 0s - loss: nan - accuracy: 0.4815
Epoch 00002: saving model to savedweights_1/chkp-0.47580644488334656.ckpt
105/105 [==============================] - 0s 5ms/step - loss: nan - accuracy: 0.4812 - val_loss: nan - val_accuracy: 0.4758
```
I've tried different sorts of training datasets. Lots of files, few files. Long files, short files. Normalization, silence cutting. Not sure what is wrong.
The graph of Accuracy and Loss of course goes like this:
What dataset are you using, and which classes? I used webm audio files (with a sample rate of 48 kHz) downloaded from YouTube using https://ymp4.download/en1/
I would sometimes get this issue when training on non-normalized data, or when the number of output neurons didn't match the actual number of classes.
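Something like this is what I mean by matching the output layer to the classes (just a rough sketch assuming a Keras model and sparse_categorical_crossentropy, not the repo's exact code):

```python
import tensorflow as tf

num_classes = 4  # must equal the number of distinct labels in the dataset

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(13,)),            # e.g. 13 MFCC coefficients per frame
    tf.keras.layers.BatchNormalization(),          # normalizing inputs helps avoid NaN losses
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(num_classes, activation="softmax"),  # output neurons == classes
])

# sparse_categorical_crossentropy expects integer labels in [0, num_classes - 1]
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```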
I tried to make my own classes in an attempt to recognize a certain artist (e.g. trying to spot ghost producers) versus tracks of the same genre and others (i.e. Above & Beyond, Trance, Others, and Drum & Bass).
I tried to modify the code to accept files other than webm. I noticed it only worked with 48 kHz... I then tried multiple wavs, then a single wav with a lot of music in it. Nothing. WebMs made with ffmpeg using libopus at 160 kbps (which should match what YouTube uses). Nothing. Huge WebMs made the code run out of memory (the 4 GB limitation), so I split them into 1-hour WebMs, which didn't work either.
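For reference, this is roughly how I was loading the non-webm files (a sketch assuming librosa; the helper name and the 48 kHz target are just my guesses about what the repo expects):

```python
import librosa

TARGET_SR = 48000  # the sample rate the rest of the pipeline seems to assume

def load_audio(path, sr=TARGET_SR):
    # librosa decodes most formats (wav, mp3, and webm if ffmpeg/audioread is
    # available) and resamples to the requested rate, so the downstream MFCC
    # shape stays consistent regardless of the source file's sample rate.
    y, _ = librosa.load(path, sr=sr, mono=True)
    return y

y = load_audio("track.webm")  # hypothetical file name
mfcc = librosa.feature.mfcc(y=y, sr=TARGET_SR, n_mfcc=13)
```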
No lies, I spent like the whole day trying to get it to work but unfortunately I got NaN losses all the time. I may give it a shot again using that website and see how it goes.
Maybe I didn't give it enough data? I used about 3-7 hours per class; do I really need, I don't know, 100?
You shouldn't need too many hours per class; even 1 hour or less should work. Technically you don't need to use 48 kHz files, but a different sample rate results in a different MFCC size, and I didn't bother figuring out the expected MFCC size when creating the empty np array (the only reason for preallocating it is that appending to the same array ~200k times is super slow).
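If recomputing the expected size is a pain, one workaround (just a sketch, not what the repo currently does) is to build one MFCC matrix per file and concatenate once at the end, which avoids both the hard-coded size and the per-frame appends:

```python
import numpy as np
import librosa

def extract_mfcc(paths, sr=48000, n_mfcc=13, hop_length=512):
    per_file = []
    for p in paths:
        y, _ = librosa.load(p, sr=sr, mono=True)
        # One (n_frames, n_mfcc) matrix per file, no per-frame appends.
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc,
                                    hop_length=hop_length).T
        per_file.append(mfcc.astype(np.float32))
    # A single concatenation at the end instead of ~200k np.append calls.
    return np.concatenate(per_file, axis=0)
```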
All I can think of is to make sure the labels are right for each class: if you have 7 classes, the labels should be 0-6.
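A quick sanity check along those lines (a sketch; check_labels is just a hypothetical helper):

```python
import numpy as np

def check_labels(labels, num_classes):
    """Verify integer labels run from 0 to num_classes - 1 and report class balance."""
    labels = np.asarray(labels)
    assert labels.min() >= 0 and labels.max() == num_classes - 1, \
        "labels must run from 0 to num_classes - 1"
    # A stray label outside that range (or a gap) is an easy way to get NaN losses.
    print("samples per class:", np.bincount(labels, minlength=num_classes))

# e.g. 4 classes: Above & Beyond, Trance, Others, Drum & Bass
check_labels([0, 1, 2, 3, 1, 0, 2, 3], num_classes=4)
```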
Here's my dubstep vs classical array (just 2 classes): https://we.tl/t-geHhxhqjR2 Feel free to try it out. I will try with wav and mp3 myself and see if that works for me.
Just tried it on two mp3 files, one for each class and each 1 hour long. Worked fine for me, except I had to change the MFCC length. But it worked exactly as intended, no problems over here!
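For reference, the MFCC length change follows from the sample rate: with librosa's usual centered framing the frame count is roughly 1 + samples // hop_length, so 44.1 kHz mp3s give a different length than 48 kHz webms (hop_length=512 is my assumption about the default used here):

```python
def expected_mfcc_frames(duration_s, sr, hop_length=512):
    # Rough frame count for a centered STFT: 1 + floor(samples / hop_length).
    return 1 + int(duration_s * sr) // hop_length

print(expected_mfcc_frames(3600, 48000))  # 1 h of 48 kHz audio
print(expected_mfcc_frames(3600, 44100))  # 1 h of 44.1 kHz audio (typical mp3)
```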
Maybe send me your code and files too if you can?