latent-diffusion FID on COCO for text-to-image generation

I was trying to evaluate the text-to-image generation results on the COCO dataset. I computed the FID on 259 samples randomly drawn from the COCO validation set and got an FID of 134, which is far higher than the 12.63 reported in the paper. I suspect I am doing something wrong in the evaluation and hope someone here can point it out. I just ran the pretrained model with the default configuration and computed the FID using this repo: https://github.com/mseitzer/pytorch-fid

Jun 22 '22 01:06 zengxianyu

In my experience, the FID needs a large number of images to be accurate. Something like 3k-10k should be enough.
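
To illustrate why the sample count matters, here is a minimal sketch, assuming NumPy and synthetic Gaussian features in place of real Inception activations. It is not code from pytorch-fid or from the paper. It draws two sets from the same distribution, so the true distance is zero, and whatever it reports is estimation bias. The helper names `frechet_distance` and `same_distribution_fid`, and the feature dimension of 64, are made up for this example; real FID features are typically 2048-dimensional, which makes the effect stronger at 259 samples.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two (n, d) feature arrays."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # tr((cov_a cov_b)^(1/2)) is the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative
    # (up to round-off) when both covariances are positive semi-definite.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)


def same_distribution_fid(n_samples, dim=64, seed=0):
    """Distance between two independent draws from one Gaussian (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    feats_a = rng.standard_normal((n_samples, dim))
    feats_b = rng.standard_normal((n_samples, dim))
    return frechet_distance(feats_a, feats_b)


if __name__ == "__main__":
    # The true distance is 0 in every row; the printed value is pure bias.
    for n in (259, 3000, 30000):
        print(n, same_distribution_fid(n))
```

The printed values should shrink as the sample count grows, so a score computed on a few hundred images is not comparable to one computed on 30000.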

Jul 11 '22 17:07 dillon101001

Have you reproduced the FID results on MS-COCO? I evaluated with the 30000 samples mentioned in the paper, and the FID is around 40.

Jan 11 '23 05:01 ttliu-kiwi

Same. I sampled 30000 captions from the COCO 2014 val set and the FID is 33.
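
For anyone trying to repeat this, the caption sampling step could look roughly like the sketch below. It assumes the standard COCO captions layout (an `annotations` list whose entries carry `image_id` and `caption`) and picks at most one caption per image with a fixed seed. The one-caption-per-image rule and the function name `sample_captions` are assumptions made for this example, not something confirmed by the paper or by this thread.

```python
import random


def sample_captions(coco_captions, n, seed=0):
    """Pick n (image_id, caption) pairs, one caption per image, reproducibly."""
    by_image = {}
    for ann in coco_captions["annotations"]:
        by_image.setdefault(ann["image_id"], []).append(ann["caption"])
    rng = random.Random(seed)
    # Sort the ids so the result depends only on the seed, not on dict order.
    image_ids = rng.sample(sorted(by_image), n)
    return [(image_id, rng.choice(by_image[image_id])) for image_id in image_ids]


if __name__ == "__main__":
    # Tiny stand-in for a captions file such as captions_val2014.json.
    toy = {
        "annotations": [
            {"image_id": 1, "caption": "a cat on a sofa"},
            {"image_id": 1, "caption": "a sleeping cat"},
            {"image_id": 2, "caption": "a red bus on a street"},
            {"image_id": 3, "caption": "two people riding bikes"},
        ]
    }
    for image_id, caption in sample_captions(toy, n=2):
        print(image_id, caption)
```

With the real annotations, the dict would come from `json.load` on the captions file and `n` would be 30000. Keeping the image ids alongside the captions makes it possible to build the matching set of real images for the FID reference statistics.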

Jan 11 '23 12:01 lhy101

I also want to evaluate the text-to-image generation results. However, I have not found a config file for it like the ones for LSUN, as shown in the following figure. Could you please share your file? Thank you very much.

Mar 09 '23 12:03 Wujie-nju

@ttliu-kiwi, @lhy101, could you please share your evaluation test script? Thank you very much!

May 18 '23 08:05 jiayisunx

Hi, have you solved this issue? I also got similar results :(

Jul 21 '23 03:07 stdKonjac

Not yet :(

Jul 31 '23 10:07 lhy101