LiviaNET
A problem about NAN
Hi josedolz, sorry to bother you. I am trying to use your network for some experiments, but when I train my own model the cost becomes NaN after one training step, and I don't know why. Have you run into the same problem before? The network parameters follow the first architecture in your paper (kernel 7×7×7, input 27×27×27), and I changed the code to run under Python 3.
Same problem here. I set the following in LiviaNet_Config.ini:
n_classes = 5
number Of Epochs = 10
number Of SubEpochs = 1000
imageTypes = 0
# custom dataset, no ROI
# other setting is the same

I think this happens because imagesSamplesAll ends up with 0 samples in src/LiviaNet/startTraining.py.
That makes numberBatches=0, so the model stops training.
https://github.com/josedolz/LiviaNET/blob/ac01462623e5a8a49f9f5ad172d4ec76681c6b80/src/LiviaNet/startTraining.py#L129-L162
I am still trying to figure out why getSamplesSubepoch() returns 0 samples:
https://github.com/josedolz/LiviaNET/blob/ac01462623e5a8a49f9f5ad172d4ec76681c6b80/src/LiviaNet/startTraining.py#L129-L138
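To illustrate the chain described above (0 samples, then 0 batches, then an undefined average cost), here is a minimal sketch in plain Python. It is not the code from startTraining.py: the function names count_batches, mean_cost and check_samples are hypothetical, and it assumes the reported cost is an average over the batches of a subepoch.

```python
import math


def count_batches(num_samples, batch_size):
    """Number of full batches that can be drawn from num_samples."""
    return num_samples // batch_size


def mean_cost(batch_costs):
    """Average cost over the batches; NaN when there were no batches."""
    if not batch_costs:
        # Averaging over zero batches is undefined, so report NaN.
        return float("nan")
    return sum(batch_costs) / len(batch_costs)


def check_samples(num_samples, batch_size):
    """Fail early with a readable message instead of training on nothing."""
    n_batches = count_batches(num_samples, batch_size)
    if n_batches == 0:
        raise ValueError(
            "No batches to train on: got %d samples for batch size %d. "
            "Check the image paths in the config file." % (num_samples, batch_size)
        )
    return n_batches
```

A guard like check_samples, called right after sampling, would turn the silent NaN into an error that points at the likely cause.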
Hi, it has been a long time since I last used this code, but I recall that this happens when there are 0 samples to process (typically because of a wrong path to the image files). Check whether the files can be found and loaded, and if they can, find out why the number of samples is still 0.
Best,
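For anyone hitting the same issue, the path check suggested above could look roughly like this. It is only a sketch: list_image_files is a hypothetical helper, not part of LiviaNET, and the extensions are an assumption about which image formats you use.

```python
import os


def list_image_files(folder, extensions=(".nii", ".nii.gz", ".mat")):
    """Return the sorted image paths in folder.

    Raises if the folder is missing or contains no matching files.
    The default extensions are an assumption; adjust them to your data.
    """
    if not os.path.isdir(folder):
        raise FileNotFoundError("Folder does not exist: %s" % folder)
    files = sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.lower().endswith(extensions)
    )
    if not files:
        raise ValueError("No image files found in: %s" % folder)
    return files
```

Calling it on each folder named in the config file (training images, ground truth, and ROI if used) before starting training shows quickly whether a path is wrong. If every folder passes and the number of samples is still 0, the problem is more likely in the sampling itself than in the paths.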