eg3d
FID during training differs greatly from a standalone evaluation of the same checkpoint
Hi @ryanrussell @ericryanchan, thanks for open-sourcing this awesome method!
I retrained the eg3d model using the following command:
python train.py --outdir=~/training-runs --cfg=ffhq --data ./FFHQ_512.zip --gpus=4 --batch=16 --gamma=1 --gen_pose_cond=True
The FID reported by the evaluation that runs during training is quite high (~95), even after 13000 kimg:
tick 3250 kimg 13000.0 time 5d 01h 48m sec/tick 128.5 sec/kimg 32.11 maintenance 0.1 cpumem 3.94 gpumem 8.73 reserved 10.29 augment 0.000
~/training-runs/00014-ffhq-FFHQ_512-gpus4-batch16-gamma1
Evaluating metrics...
{"results": {"fid50k_full": 95.52802728177237}, "metric": "fid50k_full", "total_time": 289.52356004714966, "total_time_str": "4m 50s", "num_gpus": 4, "snapshot_pkl": "network-snapshot-013000.pkl", "timestamp": 1693301505.8255823}
tick 3251 kimg 13004.0 time 5d 01h 55m sec/tick 128.3 sec/kimg 32.08 maintenance 310.1 cpumem 3.86 gpumem 8.73 reserved 10.29 augment 0.000
tick 3252 kimg 13008.0 time 5d 01h 57m sec/tick 128.4 sec/kimg 32.09 maintenance 0.1 cpumem 3.86 gpumem 8.73 reserved 10.29 augment 0.000
However, when I evaluate the same checkpoint with the standalone evaluation script:
python calc_metrics.py --metrics=fid50k_full --data ./FFHQ_512.zip --network ./network-snapshot-013000.pkl
the FID is quite low (~8):
generator features items 50000 time 32m 32s ms/item 41.20
{"results": {"fid50k_full": 8.188923527848317}, "metric": "fid50k_full", "total_time": 2188.9980747699738, "total_time_str": "36m 29s", "num_gpus": 1, "snapshot_pkl": ".\\network-snapshot-013000.pkl", "timestamp": 1693321722.8993306}
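To make the comparison concrete, here is a minimal sketch that parses the two result records above and lists the fields that differ. The JSON strings are copied from the logs; the helper names `flatten` and `differing_fields` are hypothetical and not part of the eg3d code.

```python
import json

# Result record printed by the evaluation that runs during training (copied from the log).
during_training = json.loads(
    r'{"results": {"fid50k_full": 95.52802728177237}, "metric": "fid50k_full", '
    r'"total_time": 289.52356004714966, "total_time_str": "4m 50s", "num_gpus": 4, '
    r'"snapshot_pkl": "network-snapshot-013000.pkl", "timestamp": 1693301505.8255823}'
)

# Result record printed by the standalone calc_metrics.py run (copied from the log).
standalone = json.loads(
    r'{"results": {"fid50k_full": 8.188923527848317}, "metric": "fid50k_full", '
    r'"total_time": 2188.9980747699738, "total_time_str": "36m 29s", "num_gpus": 1, '
    r'"snapshot_pkl": ".\\network-snapshot-013000.pkl", "timestamp": 1693321722.8993306}'
)


def flatten(record):
    """Lift the nested "results" entries to the top level of the record."""
    flat = {k: v for k, v in record.items() if k != "results"}
    flat.update(record["results"])
    return flat


def differing_fields(a, b):
    """Return {field: (value_in_a, value_in_b)} for shared fields whose values differ."""
    fa, fb = flatten(a), flatten(b)
    return {k: (fa[k], fb[k]) for k in sorted(fa.keys() & fb.keys()) if fa[k] != fb[k]}


diff = differing_fields(during_training, standalone)
for field, (train_value, standalone_value) in diff.items():
    print(f"{field}: during training = {train_value!r}, standalone = {standalone_value!r}")
```

Besides the FID itself and the timing fields, the two runs differ in `num_gpus` (4 during training, 1 standalone) and in how the snapshot path is written. I have not confirmed whether either of these is related to the gap.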
When I checked the fake images generated from this checkpoint, they looked fine and nothing seemed wrong with them.
Could you give any comments or hints on the abnormal FID during training? Thanks in advance.