
Calculating FID

Open mmuckley opened this issue 2 years ago • 3 comments

Hello, thanks for publishing this paper and repo.

I am curious about reproducing the results in the paper. I applied the Gaussian blur model to the first 1,000 images of FFHQ-256 as per Issue #4, but when using torch-fidelity I cannot reproduce the FID numbers. With torch-fidelity's image resizing I get 29.3; without resizing, 37.0. Both are far from the paper's value of 44.05.

Could you provide some more details on how to reproduce the numbers of Table 1?

mmuckley avatar Jan 26 '23 20:01 mmuckley

This might not apply to your case, but one discrepancy we found is that the authors normalize each image to the (0, 1) range using that image's own min and max before saving, instead of clipping it to (-1, 1) and then mapping to (0, 1) by adding 1 and dividing by 2.

This can be a problem if the reconstructions contain outliers that are never clipped, which skews the range. Moreover, after the labels are loaded, the same normalization is applied to them before saving, so the saved labels are not necessarily equal to the loaded ones.
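To illustrate, the two normalization conventions described above can be sketched as follows. This is a toy pure-Python sketch with hypothetical helper names, not the repo's actual saving code:

```python
# Sketch of the two normalization conventions (hypothetical helpers,
# not the repo's actual code). Model outputs are nominally in (-1, 1)
# but reconstructions may contain out-of-range outliers.

def normalize_minmax(pixels):
    """Rescale by the image's own min and max (per-image min-max)."""
    lo, hi = min(pixels), max(pixels)
    return [(p - lo) / (hi - lo) for p in pixels]

def normalize_clip(pixels):
    """Clip to (-1, 1), then map to (0, 1) via (x + 1) / 2."""
    return [(max(-1.0, min(1.0, p)) + 1.0) / 2.0 for p in pixels]

# A single outlier skews the min-max version for every other pixel:
img = [-0.5, 0.0, 0.5, 3.0]  # 3.0 is an out-of-range outlier
print(normalize_minmax(img))  # [0.0, ~0.143, ~0.286, 1.0]
print(normalize_clip(img))    # [0.25, 0.5, 0.75, 1.0]
```

With no outliers the two conventions nearly agree, but one stray value compresses the whole image under per-image min-max, which would shift FID relative to the clipped convention.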

z-fabian avatar Feb 09 '23 01:02 z-fabian

Hey @mmuckley, I'm also trying to reproduce the Table 1 FID scores, and I'm unable to match the FFHQ random-inpainting results. I'm wondering if I'm missing some preprocessing steps. I'm using the FFHQ-256 set at https://www.kaggle.com/datasets/xhlulu/flickrfaceshq-dataset-nvidia-resized-256px.
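For what it's worth, "random inpainting" here means dropping individual pixels at random. A minimal sketch of such a mask (the drop probability and function name are placeholders, not values from the paper):

```python
import random

def random_pixel_mask(h, w, drop_prob, seed=0):
    """Binary mask: 1 = observed pixel, 0 = dropped. drop_prob is a placeholder."""
    rng = random.Random(seed)
    return [[0 if rng.random() < drop_prob else 1 for _ in range(w)]
            for _ in range(h)]

mask = random_pixel_mask(256, 256, drop_prob=0.92)
kept = sum(sum(row) for row in mask)
print(kept / (256 * 256))  # roughly 1 - drop_prob
```

If the masking ratio or the random seed used to generate masks differs from the authors' setup, the reconstruction difficulty and hence the FID will differ too.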

neginraoof avatar May 05 '23 19:05 neginraoof

FID can also differ depending on whether the reconstructions are compared only to the validation set, or to the training and validation sets combined. Typically, FID is much worse when computed against fewer samples.
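The small-sample bias can be demonstrated with a toy 1-D version of the Fréchet distance (for 1-D Gaussians it reduces to (μ₁ − μ₂)² + (σ₁ − σ₂)²; this is a stdlib-only sketch, not an image FID):

```python
import random
import statistics

def fid_1d(xs, ys):
    """Frechet distance between 1-D Gaussians fitted to two samples."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.pstdev(xs), statistics.pstdev(ys)
    return (mx - my) ** 2 + (sx - sy) ** 2

def avg_fid(n, trials=200, rng=random.Random(0)):
    """Average FID over trials where BOTH samples come from N(0, 1)."""
    total = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0, 1) for _ in range(n)]
        ys = [rng.gauss(0, 1) for _ in range(n)]
        total += fid_1d(xs, ys)
    return total / trials

small, large = avg_fid(50), avg_fid(5000)
print(small, large)  # the small-sample estimate is noticeably larger
```

Even with identical distributions the true distance is zero, yet the 50-sample estimate is roughly two orders of magnitude above the 5000-sample one, so reference-set size matters when comparing FID numbers across papers.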

z-fabian avatar May 09 '23 22:05 z-fabian