LlamaGen

FID results of GPT-L and GPT-1B on 256x256 images

LutingWang opened this issue 1 year ago · 5 comments

Hi, thanks for the excellent work. I'm trying to reproduce the results on 256x256 images. The VQGAN model is reproduced successfully, achieving $2.10$ rFID. However, the AR part shows a significant performance gap. More specifically, I use 8 A100-80G GPUs to run the following scripts:

```bash
bash scripts/autoregressive/train_c2i.sh --cloud-save-path xxx --code-path xxx --gpt-model GPT-L --epochs 50
bash scripts/autoregressive/train_c2i.sh --cloud-save-path xxx --code-path xxx --gpt-model GPT-1B --epochs 50
```
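
For reference, the FID column below comes from sampling the trained model and scoring the samples against an ImageNet reference set. Here is a minimal sketch of one way to run such an evaluation, using torchmetrics rather than the ADM evaluation suite (which the official numbers presumably use), so the value may differ slightly; the folder names are placeholders:

```python
# Minimal FID sketch with torchmetrics (pip install torchmetrics torch-fidelity).
# Not the ADM evaluation suite, so the number can differ slightly from the paper.
# `ref_dir` / `gen_dir` are hypothetical folders of 256x256 images
# (ImageFolder expects one level of class subdirectories).
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from torchmetrics.image.fid import FrechetInceptionDistance

transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(256),
    transforms.PILToTensor(),  # uint8 tensors, the dtype FID expects by default
])

def make_loader(root):
    return DataLoader(datasets.ImageFolder(root, transform=transform),
                      batch_size=64, num_workers=8)

fid = FrechetInceptionDistance(feature=2048).cuda()
for imgs, _ in make_loader("ref_dir"):   # real ImageNet reference images
    fid.update(imgs.cuda(), real=True)
for imgs, _ in make_loader("gen_dir"):   # samples from the trained GPT + VQGAN decoder
    fid.update(imgs.cuda(), real=False)
print("FID:", fid.compute().item())
```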

The training results are as follows:

| Model  | Final Loss | FID  | Expected FID |
|--------|------------|------|--------------|
| GPT-L  | 7.86       | 4.62 | 4.22         |
| GPT-1B | 7.33       | 4.13 | 3.09         |
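
As a rough sanity check on the loss values, the token-level cross-entropy (in nats) can be turned into a per-token perplexity and compared with a uniform guess over the VQ codebook; the 16384-entry codebook size below is my assumption:

```python
# Back-of-the-envelope check: cross-entropy (nats) -> per-token perplexity,
# compared with a uniform distribution over the VQ codebook
# (16384 entries assumed here).
import math

VOCAB = 16384
print(f"uniform-baseline loss: {math.log(VOCAB):.2f} nats")
for name, loss in [("GPT-L", 7.86), ("GPT-1B", 7.33)]:
    print(f"{name}: loss {loss:.2f} -> perplexity ~{math.exp(loss):.0f} of {VOCAB} codes")
```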

Is the final loss reasonable? Do you have any idea what the reason might be?

Thanks!

LutingWang avatar Jul 19 '24 07:07 LutingWang

Hi~ I don't understand what you mean by reproducing the results at 224x224. The expected FID is at 256x256.

PeizeSun avatar Jul 23 '24 06:07 PeizeSun

> Hi~ I don't understand what you mean by reproducing the results at 224x224. The expected FID is at 256x256.

Sorry for the mistake. I was trying to emphasize that the image resolution is not 384x384, but I mistakenly wrote 224.

LutingWang avatar Jul 23 '24 12:07 LutingWang

> Hi~ I don't understand what you mean by reproducing the results at 224x224. The expected FID is at 256x256.

Hi. Thank you for this awesome repo. I have the same issue with the original code: the loss ends at around 7.3 after 300 epochs. [attached loss curve: IMG_0379]

msed-Ebrahimi avatar Jul 24 '24 03:07 msed-Ebrahimi

> > Hi~ I don't understand what you mean by reproducing the results at 224x224. The expected FID is at 256x256.
>
> Hi. Thank you for this awesome repo. I have the same issue with the original code: the loss ends at around 7.3 after 300 epochs. [attached loss curve: IMG_0379]

Hello, could I ask what hardware and how much training time are needed to get the results shown in the figure?

fupiao1998 avatar Dec 19 '24 13:12 fupiao1998

> > Hi~ I don't understand what you mean by reproducing the results at 224x224. The expected FID is at 256x256.
>
> Hi. Thank you for this awesome repo. I have the same issue with the original code: the loss ends at around 7.3 after 300 epochs. [attached loss curve: IMG_0379]

Hello, could I ask what model size this loss curve corresponds to, and what FID you computed? Thanks!

haha-yuki-haha avatar Jan 13 '25 03:01 haha-yuki-haha