
Why does the training loss become NaN?

Open senlin-ali opened this issue 3 years ago • 3 comments

Hi, I have a question about training. When I train on my own data, the loss decreases from about 1.0 to 0.8, and then it becomes NaN. How can I solve this problem?

senlin-ali avatar Nov 11 '21 09:11 senlin-ali
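A common cause of a loss that starts well and then jumps to NaN is an exploding gradient on one bad batch. Below is a hedged, minimal PyTorch sketch of two standard mitigations: lowering the learning rate and clipping the global gradient norm. The tiny `Linear` model and the numbers here are stand-ins for illustration, not the Zero-DCE network or its actual hyperparameters.

```python
# Minimal sketch (illustrative model, not the Zero-DCE architecture):
# lower the learning rate and clip gradients so one bad batch
# cannot push the weights to inf/NaN.
import torch

torch.manual_seed(0)
model = torch.nn.Linear(8, 8)
# A smaller learning rate often delays or prevents divergence.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

x = torch.rand(4, 8)
loss = model(x).pow(2).mean()

optimizer.zero_grad()
loss.backward()
# Clip the global gradient norm before stepping the optimizer.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=0.1)
optimizer.step()

print(torch.isfinite(loss).item())
```

If the repo's training script already clips gradients, the clip norm and learning rate are still the first knobs worth tightening when NaN appears.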

I also have the same problem.

WEIZHIHONG720 avatar Feb 04 '22 03:02 WEIZHIHONG720

> Hi, I have a question about training. When I train on my own data, the loss decreases from about 1.0 to 0.8, and then it becomes NaN. How can I solve this problem?

I also have the same problem. My dataset contains only low-light images. Is it necessary for the training set to contain both low-light and normal-light images?

Chzzi avatar Mar 26 '22 09:03 Chzzi

I encountered the same problem, and each time it occurred at a specific iteration of the first epoch. At that iteration, all four loss components become NaN. I tried moving the images from that iteration into the first training batch, but the loss was not NaN in the first batch, so I suspect this phenomenon is unrelated to my data.

To rule out configuration issues, I used the default values for all training hyperparameters. I also tried loading the pretrained model, but the loss still became NaN at a specific iteration.

Do you have any suggestions? If you need other training details, I can provide them.

I look forward to your reply. Thank you! @Li-Chongyi

cwzzzzz avatar Mar 25 '23 08:03 cwzzzzz
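Since the NaN appears at a reproducible iteration, it helps to pinpoint which of the four loss terms goes non-finite first, and which op produced it. This is a hedged sketch: the loss names (`spa`, `exp`) and the `check_losses` helper are hypothetical placeholders, not part of the Zero-DCE code; `torch.autograd.set_detect_anomaly` is the standard PyTorch tool for tracing the backward op that generated a NaN.

```python
# Hypothetical debugging sketch: fail fast at the first non-finite loss term.
import torch

def check_losses(step, losses):
    """Raise as soon as any loss component stops being finite."""
    for name, value in losses.items():
        if not torch.isfinite(value):
            raise RuntimeError(f"step {step}: loss '{name}' became {value.item()}")

# set_detect_anomaly makes backward() report the op that produced the NaN
# (slow, so enable it only while debugging).
with torch.autograd.set_detect_anomaly(True):
    # Stand-in values; in the real loop these come from the loss functions.
    losses = {"spa": torch.tensor(0.3), "exp": torch.tensor(float("nan"))}
    try:
        check_losses(42, losses)
    except RuntimeError as e:
        print(e)
```

Once the offending term is known, the usual culprits are `sqrt`, `log`, or division near zero inside that loss; adding a small epsilon there is a common fix.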