A problem about NAN

Open Simon1zxb opened this issue 7 years ago • 2 comments

Hi josedolz, sorry to bother you. I am trying to use your network for some experiments, but when I train my own model the cost becomes NaN after one training step, and I really don't know why. Have you run into the same problem before? The network parameters are set as in the first architecture of your paper (kernel 7×7×7, input 27×27×27), and I changed the code to run on Python 3.

Simon1zxb avatar Jul 30 '18 21:07 Simon1zxb

Same problem. I set the following in LiviaNet_Config.ini:

n_classes = 5
number Of Epochs = 10
number Of SubEpochs = 1000
imageTypes = 0
# custom dataset, no ROI
# other settings are the same


I think it is because imagesSamplesAll=0 in src/LiviaNet/startTraining.py, which causes numberBatches=0, so the model stops training.

https://github.com/josedolz/LiviaNET/blob/ac01462623e5a8a49f9f5ad172d4ec76681c6b80/src/LiviaNet/startTraining.py#L129-L162

I am still trying to figure out why getSamplesSubepoch() outputs 0: https://github.com/josedolz/LiviaNET/blob/ac01462623e5a8a49f9f5ad172d4ec76681c6b80/src/LiviaNet/startTraining.py#L129-L138
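A minimal sketch of how the zero-sample case could be caught early, rather than surfacing later as a NaN cost. It is not taken from the LiviaNET code: `number_of_batches` and `mean_cost` are hypothetical helpers, and the integer-division batch count is an assumption about how the batches are derived from the samples. The idea is only that averaging costs over zero batches has no defined value, so it is safer to fail loudly.

```python
def number_of_batches(num_samples, batch_size):
    """Number of full batches available (assumed integer division)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return num_samples // batch_size


def mean_cost(batch_costs):
    """Average the per-batch costs, refusing to average an empty list."""
    if len(batch_costs) == 0:
        # With zero batches the mean is undefined; stop here with a clear
        # message instead of letting a NaN propagate into the training log.
        raise ValueError(
            "no batches were processed; check that samples were extracted"
        )
    return sum(batch_costs) / len(batch_costs)


if __name__ == "__main__":
    # Zero samples means zero batches, which is the suspected failure case.
    print(number_of_batches(0, 10))
    print(mean_cost([0.9, 0.7, 0.5]))
```

With a guard like this, an empty sample list stops the run at the first sub-epoch with an explicit error.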

ytliang97 avatar Jun 13 '20 22:06 ytliang97

Hi, it has been a long time since I last used this code, but I recall that this happens when there are 0 samples to process (typically because of a wrong path to the image files). Check whether the files can be found and loaded, and if they can, find out why the number of samples is still 0.
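A quick way to run the first check is sketched below. It is a standalone helper, not part of LiviaNET: the function name `list_image_files` and the extension list are assumptions, so adjust them to your data. It only verifies that the folder exists and contains files with the expected extensions; whether each file actually loads still has to be checked separately.

```python
import os
import tempfile


def list_image_files(folder, extensions=(".nii", ".nii.gz", ".mat")):
    """Return sorted paths of image files in `folder`, or raise if none."""
    if not os.path.isdir(folder):
        raise FileNotFoundError(f"image folder not found: {folder}")
    # str.endswith accepts a tuple, so one call covers all extensions.
    names = sorted(
        name for name in os.listdir(folder)
        if name.lower().endswith(extensions)
    )
    if not names:
        raise ValueError(f"no image files with {extensions} in {folder}")
    return [os.path.join(folder, name) for name in names]


if __name__ == "__main__":
    # Demo on a throwaway folder; point this at your own training folder.
    with tempfile.TemporaryDirectory() as folder:
        open(os.path.join(folder, "subject_01.nii"), "w").close()
        print(list_image_files(folder))
```

Running it on the training, ground-truth, and ROI folders named in the config file should show right away whether a path is wrong.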

Best,

josedolz avatar Jun 24 '20 21:06 josedolz