BigGAN-PyTorch
FID is nan
Hi guys, thank you for the amazing work!
I'm having trouble with FID. During training, the log often shows that FID is nan. It does not happen every time, but more than half the time.
So I wonder: in which cases can FID be nan? And what can I do to prevent it?
Thank you very much!
Hi @doantientai, the FID value is a kind of distance between two Gaussians and comes from the Fréchet distance computation. One possible reason for a NaN FID value is a small number of samples (<2048). You can see two different Fréchet distance implementations in inception_utils.py; by default it uses the 'torch' method. Try the 'numpy' method instead of 'torch'. I hope it works.
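For reference, the quantity both methods compute is the Fréchet distance between two Gaussians, FID = ||mu1 - mu2||² + Tr(S1 + S2 - 2·(S1·S2)^½). The repo's numpy method uses scipy.linalg.sqrtm for the matrix square root; the sketch below is a simplified, self-contained version (not the repo's exact code) that gets the trace of the square root from the eigenvalues instead:

```python
import numpy as np

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between N(mu1, sigma1) and N(mu2, sigma2).

    FID = ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 (sigma1 sigma2)^{1/2})
    """
    diff = mu1 - mu2
    # Tr((sigma1 @ sigma2)^{1/2}) equals the sum of the square roots of the
    # eigenvalues of sigma1 @ sigma2; for valid covariance matrices these are
    # real and non-negative, so we clamp tiny negative/imaginary noise away.
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    tr_covmean = np.sqrt(np.maximum(eigvals.real, 0)).sum()
    return diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2 * tr_covmean

# Identical Gaussians -> distance 0; shifting the mean by 2 in each of
# 3 dimensions with identity covariances -> ||diff||^2 = 12.
print(frechet_distance(np.zeros(4), np.eye(4), np.zeros(4), np.eye(4)))
print(frechet_distance(np.zeros(3), np.eye(3), 2 * np.ones(3), np.eye(3)))
```

If the sample statistics are computed from too few samples, the covariance estimate can be ill-conditioned and the matrix square root can go complex or NaN, which is consistent with the advice above.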
Thank you @damlasena, I switched to the numpy method and it works. I also think that line 306 in inception_utils.py should be:
FID = numpy_calculate_frechet_distance(mu, sigma, data_mu, data_sigma)
instead of:
FID = numpy_calculate_frechet_distance(mu.cpu().numpy(), sigma.cpu().numpy(), data_mu, data_sigma)
because on line 299, mu and sigma have already been converted to numpy arrays
Hi @doantientai @damlasena. Has anyone else found that numpy_calculate_frechet_distance takes a very long time? For example, when I want to verify the model's performance, it takes almost a day or more to compute the metric. I don't know why it takes so long. Can anyone help me fix this problem? Thanks!
BTW, I use a V100 32GB device with 8 CPUs.