
FID not matching for COCO dataset

Open Mishra1995 opened this issue 2 years ago • 1 comment

Hi,

Thanks for providing the code for the paper!

I tried to reproduce the FID score reported in the paper for the COCO dataset. I generated images for the validation set using the generator checkpoint from epoch 120 (netG_120.pth) and the text encoder from epoch 595, then computed the score with the PyTorch implementation of FID. I get an FID of around 121, as opposed to the 19 reported.

I resized the original COCO images to 256x256 resolution so that the image size is consistent, but the score is still high.
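For reference, FID is the Fréchet distance between two Gaussians fitted to Inception features of the real and the generated images. Below is a minimal sketch of that distance only, assuming the features have already been extracted into NumPy arrays with one row per image. The function name `frechet_distance` is hypothetical and is not taken from this repository or from the PyTorch FID package.

```python
import numpy as np


def frechet_distance(feats_real, feats_fake):
    """Frechet distance between Gaussians fitted to two feature sets.

    Both inputs are 2-D arrays of shape (num_samples, feature_dim),
    with feature_dim >= 2. Feature extraction is assumed to be done elsewhere.
    """
    # Fit a Gaussian (mean and covariance) to each feature set.
    mu1, mu2 = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    sigma1 = np.cov(feats_real, rowvar=False)
    sigma2 = np.cov(feats_fake, rowvar=False)

    diff = mu1 - mu2

    # Tr(sqrt(S1 @ S2)) equals the sum of the square roots of the eigenvalues
    # of S1 @ S2. For covariance matrices these eigenvalues are real and
    # non-negative up to numerical noise, so small negatives are clipped.
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    tr_covmean = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * tr_covmean)
```

Because the score depends only on these feature statistics, it is sensitive to the number of samples and to how the images are preprocessed, so the real and generated sets should go through the same resizing and normalization before feature extraction.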

The generated images also look wrong for sample sentences.

For example, the attached image was generated from the caption "A close up of a boat on a field with a cloudy sky". This caption is taken from the paper, but the output of the final generator and text encoder models is nowhere near the result presented there.

Any suggestions on what I should change?

(attached image: img_0)

Also, could you please explain the difference between main.py and main_finetune.py? The two scripts look almost identical.

Mishra1995, Sep 05 '22 19:09

I evaluated on the COCO2014 validation set and got an FID score of 18.18. However, the visual quality of the generated images is not ideal.

QinSY123, Nov 18 '22 01:11