
How to reproduce the FID 3.60 result for LDM-4-G on ImageNet?

Open ThisisBillhe opened this issue 1 year ago • 4 comments

  1. Should I use cin256-v2.yaml for LDM-4?
  2. The paper only mentioned scale=1.5 and step=250 but didn't mention the eta for this result. I tried using eta=0 (IS=115, FID=5.04) and eta=1 (IS=157, FID=4.65). What eta should I use to reproduce a result of FID 3.60?
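For context on the eta parameter: in DDIM sampling, eta scales the per-step noise, with eta=0 giving fully deterministic DDIM and eta=1 recovering DDPM-like variance. A minimal sketch of the sigma schedule (plain NumPy; the alpha schedule here is illustrative, the real one comes from the model config):

```python
import numpy as np

# Illustrative cumulative-alpha schedule (decreasing in t);
# the actual values come from the trained model's noise schedule.
alphas_cumprod = np.linspace(0.9999, 0.005, 250)

def ddim_sigma(eta, a_t, a_prev):
    """DDIM per-step noise scale (DDIM paper, Eq. 16).

    eta=0 -> sigma=0, deterministic sampling;
    eta=1 -> DDPM-like stochastic sampling.
    """
    return eta * np.sqrt((1 - a_prev) / (1 - a_t)) * np.sqrt(1 - a_t / a_prev)

a_t, a_prev = alphas_cumprod[100], alphas_cumprod[99]
print(ddim_sigma(0.0, a_t, a_prev))      # 0.0: no injected noise
print(ddim_sigma(1.0, a_t, a_prev) > 0)  # True: noise injected each step
```

Since eta changes the sample distribution, it is plausible that it shifts IS/FID, which is why pinning down the paper's value matters.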

ThisisBillhe avatar Apr 12 '23 12:04 ThisisBillhe

  1. Yes
  2. I successfully reproduced the result by uniformly generating 50 images from each class. Results are shown below (IS from torch-fidelity, the others from the guided_diffusion evaluation code; the paper claims that FID computed by the two tools almost coincides):
| cfg with steps=250, scale=1.5 | IS↑ | FID↓ | sFID↓ | Prec.↑ | Recall↑ |
|---|---|---|---|---|---|
| Paper (Table 10) | 247.67±5.59 | 3.60 | - | 87% | 48% |
| eta=0 | 205.55±5.27 | 3.31 | 5.10 | 82.95% | 53.57% |
| eta=1 | 249.59±3.30 | 3.54 | 5.10 | 87.15% | 48.50% |

Thus I guess the eta used in the paper is 1.
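For anyone reproducing this, "uniformly generate 50 images from each class" means a balanced set of 50,000 samples (50 × 1000 ImageNet classes). A hedged sketch of the label loop; `sample_batch` is a hypothetical placeholder for the repo's `DDIMSampler.sample()` call with steps=250, scale=1.5, eta=1:

```python
import numpy as np

num_classes, per_class, batch = 1000, 50, 50

# Balanced label set: exactly 50 samples for each of the 1000 classes.
labels = np.repeat(np.arange(num_classes), per_class)

for i in range(0, len(labels), batch):
    class_batch = labels[i:i + batch]
    # images = sample_batch(class_batch)  # hypothetical wrapper around
    #                                     # DDIMSampler.sample(); not runnable here
```

The balanced class distribution matters because FID/IS against the ImageNet reference assume roughly uniform class coverage.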

Jiang-Stan avatar Jul 14 '23 06:07 Jiang-Stan

Hi. Thanks for your reply. I have another question: did you use the reference batch provided by guided_diffusion? Does that influence the FID results compared to evaluating against statistics from the entire ImageNet training set?
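The question is well founded: FID is computed against the mean and covariance of a reference set, so swapping guided_diffusion's reference batch for full training-set statistics can shift the number. A minimal sketch of the Fréchet distance itself (standard formula, synthetic statistics for illustration) showing that the same generated statistics score differently against two references:

```python
import numpy as np
from scipy import linalg

def frechet_distance(mu1, cov1, mu2, cov2):
    # FID between two Gaussians:
    # ||mu1 - mu2||^2 + Tr(C1 + C2 - 2*(C1 @ C2)^{1/2})
    covmean = linalg.sqrtm(cov1 @ cov2)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # discard numerical imaginary residue
    diff = mu1 - mu2
    return diff @ diff + np.trace(cov1 + cov2 - 2 * covmean)

# Same "generated" statistics, two different reference statistics.
mu_gen, cov_gen = np.zeros(2), np.eye(2)
fd_a = frechet_distance(mu_gen, cov_gen, 0.1 * np.ones(2), np.eye(2))
fd_b = frechet_distance(mu_gen, cov_gen, 0.2 * np.ones(2), np.eye(2))
print(fd_a, fd_b)  # the score depends on which reference you use
```

So the two evaluation setups are only comparable if their reference statistics are (near-)identical, which is why using the published guided_diffusion reference batch is the safest way to match reported numbers.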

ThisisBillhe avatar Nov 15 '23 03:11 ThisisBillhe