diffusion-point-cloud
Met NaN after 150000 iterations
Hello,
Thanks for your work.
May I ask, have you encountered this problem when training with train_gen.py?
[lgan_mmd-CD] nan
[lgan_cov-CD] 0.24250001
[lgan_mmd_smp-CD] nan
Traceback (most recent call last):
File "train_gen.py", line 222, in
I think the training will not stop until we stop it manually, because the iteration limit is set to inf. However, it failed to generate samples using 150000.pt.
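(As a side note, here is a minimal sketch of capping the run yourself so it terminates on its own; the names are illustrative, not the script's actual variables.)

```python
# Illustrative sketch: bound the otherwise-infinite training loop with a finite budget.
import itertools

MAX_ITERS = 150000

for it in itertools.count(1):
    if it > MAX_ITERS:
        break
    # train_one_iteration(it)      # one optimization step (hypothetical helper)
    # if it % 10000 == 0:
    #     save_checkpoint(it)      # e.g. writes <it>.pt (hypothetical helper)
```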
Best Wishes
Hello, I have encountered the same problem. May I ask whether you have found the cause of the NaN and solved it?
NaN errors during model training usually occur when values become so small that they underflow to 0; dividing by that 0 then produces NaN. My guess is that after training for this long, some value somewhere has collapsed to 0 and is breaking the training.
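As an illustration (a minimal sketch, not taken from this repository; the helper names `check_finite` and `safe_normalize` are made up), you can guard divisions with a small epsilon and fail fast when a non-finite value appears, which makes the source of the NaN much easier to locate:

```python
import torch

def check_finite(name, tensor):
    # Fail fast with a clear message instead of letting NaN/Inf propagate silently.
    if not torch.isfinite(tensor).all():
        raise RuntimeError(f"Non-finite value detected in {name}")

def safe_normalize(x, dim=-1, eps=1e-8):
    # The epsilon keeps a zero-norm vector from causing a division by zero.
    return x / (x.norm(dim=dim, keepdim=True) + eps)

# A zero vector would produce NaN with a plain division; the guarded version stays finite.
v = torch.zeros(1, 3)
out = safe_normalize(v)
check_finite("normalized output", out)
print(out)  # tensor([[0., 0., 0.]])
```

Calling `check_finite` on the loss (and optionally on gradients) right before the optimizer step is a cheap way to catch the first iteration where things go wrong, instead of only noticing NaN in the evaluation metrics much later.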