
Can't get the same FID as the checkpoint on cifar-10

Open z562 opened this issue 2 years ago • 17 comments

Thanks for your great work. I tried to train a diffusion model on CIFAR-10 with the following parameters:

MODEL_FLAGS="--image_size 32 --num_channels 128 --num_res_blocks 3 --learn_sigma True --dropout 0.3"
DIFFUSION_FLAGS="--diffusion_steps 4000 --noise_schedule cosine"
TRAIN_FLAGS="--lr 1e-4 --batch_size 128"

During the test phase, I sampled 50k images with 100 diffusion steps and computed FID against the training set. However, I got FID = 6.7, which is higher than the FID (3.7) obtained with the released checkpoint.

How can I get the same FID?

z562 avatar Jun 14 '22 03:06 z562

Any updates @z562 ?

je-santos avatar Oct 03 '22 23:10 je-santos

Any updates @z562 ?

I tried other settings; their FID is around 4.0, but I failed to reach the results in the paper. :(

z562 avatar Oct 10 '22 07:10 z562

@z562 Can you share your FID calculation code?

je-santos avatar Oct 12 '22 20:10 je-santos

@z562 Can you share your FID calculation code?

https://github.com/openai/guided-diffusion/tree/main/evaluations

z562 avatar Oct 13 '22 06:10 z562

@z562 I also get a similar FID of around 7 under the given settings. Can you share your new settings when FID is around 4.0?

LuoBingjun avatar Oct 28 '22 02:10 LuoBingjun

@z562 I also get a similar FID of around 7 under the given settings. Can you share your new settings when FID is around 4.0?

I used --dropout 0.1 and --noise_schedule linear.

z562 avatar Oct 28 '22 03:10 z562

Stupid question here: is the FID computed on the test set or on the training set?

je-santos avatar Jan 07 '23 00:01 je-santos

I tried using mseitzer/pytorch-fid and got an FID of only 11.38.

I sampled 50k images using

python scripts/image_sample.py --model_path cifar10_uncond_vlb_50M_500K.pt --image_size 32 --num_channels 128 --num_res_blocks 3 --learn_sigma True --dropout 0.3 --diffusion_steps 4000 --noise_schedule cosine --use_kl True --batch_size 1000 --num_samples 50000

and fid was computed against the 50k training images.
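For anyone reproducing this comparison: pytorch-fid compares two directories of image files, so the sampler's .npz output has to be unpacked into individual images first. A minimal sketch, assuming the samples are stored under the default arr_0 key as uint8 NHWC arrays (the format image_sample.py writes); this is an illustration, not the exact code I used:

```python
import os
import numpy as np
from PIL import Image  # requires pillow

def npz_to_pngs(npz_path, out_dir):
    """Dump each sample in an .npz archive (uint8, NHWC) to a PNG file."""
    arr = np.load(npz_path)["arr_0"]
    os.makedirs(out_dir, exist_ok=True)
    for i, img in enumerate(arr):
        Image.fromarray(img).save(os.path.join(out_dir, f"{i:05d}.png"))
    return len(arr)
```

The resulting directory can then be passed to pytorch-fid alongside a directory of the 50k training images.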

njuaplusplus avatar Feb 12 '23 05:02 njuaplusplus

I got 47.87 using the following settings:

--image_size 32 --num_channels 128 --num_res_blocks 3 --diffusion_steps 1000 --noise_schedule linear --lr 1e-4 --batch_size 64

I trained the model on 4 GPUs for 500K iterations, as in Table 2 of the paper, where the reported FID is 3.29. Could anyone give me some ideas on how to train a model for the first setting in Table 2?

[screenshot of Table 2 from the paper]

njuaplusplus avatar Feb 20 '23 19:02 njuaplusplus

@njuaplusplus did you figure out how to get a better FID? Mine is also around 50, which seems like a ridiculous number, and I also trained for more than 500k steps. I'm not sure what could be so different as to cause such a big mismatch with the paper.

akrlowicz avatar Jul 20 '23 18:07 akrlowicz

Why is it that I used the same procedure as you but my FID result is over 60? Can you share the source code of your FID calculation?

zcspike avatar Aug 20 '23 19:08 zcspike

Why is it that I used the same procedure as you but my FID result is over 60? Can you share the source code of your FID calculation?

The sample script produced an npz file with values in [0, 255], which I then rescaled to the same [-1, 1] range as data_npz/cifar10.test.npz using images = (images.astype(np.float32) / 127.5) - 1.0. Then I computed the mean and covariance with an Inception network and finally applied the FID formula. Can you tell me what went wrong? Thank you very much!
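For what it's worth, a common mistake at this last step is using per-dimension variances instead of the full covariance matrix. The final distance computation is small enough to sketch; this is a generic implementation of the Fréchet distance between two Gaussians, not the exact code used above:

```python
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """FID = ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 * sqrtm(S1 @ S2)),
    where mu/sigma are the mean and full covariance of Inception features."""
    diff = mu1 - mu2
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # drop tiny imaginary parts from numerical error
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))
```

With identical statistics this returns 0, which is a quick sanity check that the plumbing is right.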

zcspike avatar Aug 20 '23 19:08 zcspike

In my case the hyperparameters were the problem. I tested a few different implementations of FID, and even though the metric values can differ slightly between them, an FID of 50 is far too big to blame on the implementation. I had missed that the amount of data seen depends on the batch size (mine was 4, the paper uses 128), so my model simply wasn't trained long enough. I also tried training a class-conditional CIFAR-10 model and found that conditioning on the class lowers the FID significantly for the same number of training steps; it probably still needs to be trained for much longer.
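As a quick sanity check on that point: the number of training images seen scales linearly with batch size, so for the same step count a batch-4 run sees far less data than the paper's batch-128 run (numbers taken from this thread):

```python
# Images seen during training = batch_size * training_steps
steps = 500_000
paper_run = 128 * steps  # batch size 128, as in the paper
small_run = 4 * steps    # batch size 4, as used in my run
print(paper_run // small_run)  # the paper's run sees 32x more images
```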

akrlowicz avatar Aug 21 '23 07:08 akrlowicz

How to generate the reference dataset npz files?

bigwahaha avatar Oct 08 '23 08:10 bigwahaha

Why is it that I used the same procedure as you but my FID result is over 60? Can you share the source code of your FID calculation?

The sample script produced an npz file with values in [0, 255], which I then rescaled to the same [-1, 1] range as data_npz/cifar10.test.npz using images = (images.astype(np.float32) / 127.5) - 1.0. Then I computed the mean and covariance with an Inception network and finally applied the FID formula. Can you tell me what went wrong? Thank you very much!

How to get data_npz/cifar10.test.npz?

bigwahaha avatar Oct 08 '23 09:10 bigwahaha

Why is it that I used the same procedure as you but my FID result is over 60? Can you share the source code of your FID calculation?

The sample script produced an npz file with values in [0, 255], which I then rescaled to the same [-1, 1] range as data_npz/cifar10.test.npz using images = (images.astype(np.float32) / 127.5) - 1.0. Then I computed the mean and covariance with an Inception network and finally applied the FID formula. Can you tell me what went wrong? Thank you very much!

How to get data_npz/cifar10.test.npz?

https://github.com/mseitzer/pytorch-fid/tree/master

pip install pytorch-fid
python -m pytorch_fid --save-stats path/to/dataset path/to/outputfile

You can use pytorch-fid to generate cifar10.test.npz from path/to/cifar10_test.

zcspike avatar Mar 12 '24 14:03 zcspike

Was anyone able to figure out how to get the same results as the paper for this FID benchmark? @z562 any updates?

adistomar avatar May 22 '24 06:05 adistomar