generative-inpainting-pytorch

An accidental bug: loss=NaN

Open kinfeparty opened this issue 4 years ago • 2 comments

Hello! I've run into a bug that is hard to solve. I've made many modifications to your code before, and everything was fine. Last week I modified the original code for a new dataset, and I ran the original code as a baseline. The original code worked fine, but the modified code hit this bug. [image] The dataset is not corrupted. No matter how I check the code and the dataset, the loss is NaN when iter < 10000. The strange thing is that when I re-ran the original code, the same bug happened, yet when I look at last week's run of the original code, the training stage is all fine. [image]

Can you run the original code? I don't know why the loss is NaN. Can you help me solve this bug? It's driving me crazy.

This is my config.yaml, which is almost the same as yours. [image]

kinfeparty avatar May 01 '20 10:05 kinfeparty

It seems that there is a bug in your data or your custom dataset. You may try to assert there is no NaN or Inf in your data. I don't know if "the dataset is not corrupted" you mentioned refers to that. You may debug the code to find where the NaN or Inf first appears.
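The check suggested above could be sketched roughly as follows. This is only an illustration, not code from the repository: the helper names `assert_finite` and `first_non_finite` and the example tensors are hypothetical, and the batch is random data standing in for one item from the data loader.

```python
import torch


def assert_finite(name, tensor):
    """Raise if `tensor` contains NaN or Inf, reporting how many bad values."""
    bad = ~torch.isfinite(tensor)
    if bad.any():
        raise ValueError(
            f"{name}: {int(bad.sum())} non-finite value(s) out of {tensor.numel()}"
        )


def first_non_finite(named_tensors):
    """Return the name of the first tensor holding NaN/Inf, or None."""
    for name, tensor in named_tensors:
        if not torch.isfinite(tensor).all():
            return name
    return None


# Hypothetical batch standing in for one item from the data loader.
batch = torch.rand(2, 3, 8, 8)
assert_finite("batch", batch)  # passes: all values are finite

# Checking intermediate values in order narrows down where NaN/Inf first appears.
stages = [
    ("input", batch),
    ("log_of_zero", torch.log(torch.zeros(2))),           # -inf
    ("zero_over_zero", torch.zeros(2) / torch.zeros(2)),  # nan
]
culprit = first_non_finite(stages)  # name of the first offending stage
```

In a real training loop the same checks would go on each loaded batch and on each loss term before `backward()`. PyTorch's `torch.autograd.set_detect_anomaly(True)` may also help locate a NaN produced during the backward pass, at the cost of slower training.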

daa233 avatar May 05 '20 14:05 daa233

Thanks for your reply. The bug is strange. My dataset has no bug. I ran the model again and again, and the bug no longer appears.

kinfeparty avatar May 05 '20 14:05 kinfeparty