diffae
EMA model lagging behind
Thanks so much for the awesome work! After around 1 million images, my model's samples already look pretty good, but the EMA samples are still very noisy (FID of 450 vs. 76). Does this make sense? I thought the EMA model is generally supposed to give better and more stable results.
It makes sense. One million images is only ~8000 iterations at a batch size of 128. That is not long enough for the EMA model to reach a sensible region of weight space, so it still carries traces of the initial weights.
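To put a rough number on "traces of initial weights", here is a minimal sketch. It assumes a plain EMA update `ema = decay * ema + (1 - decay) * w` with no warmup or bias correction, and a decay of 0.9999; that value is an assumption for illustration, not something stated in this thread, so check your own config. The function names are hypothetical.

```python
import math

def ema_init_weight(num_images, batch_size, decay):
    """Return (steps, fraction of the initial weights still in the EMA).

    With ema = decay * ema + (1 - decay) * w, the coefficient on the
    initial weights after n updates is decay ** n.
    """
    steps = num_images // batch_size
    return steps, decay ** steps

def steps_until(residual, decay):
    """Smallest number of updates after which decay ** n <= residual."""
    return math.ceil(math.log(residual) / math.log(decay))

# decay=0.9999 is an assumed value, not taken from the thread.
steps, residual = ema_init_weight(1_000_000, 128, decay=0.9999)
print(f"{steps} steps, initial weights still contribute {residual:.1%}")
print(f"steps until below 1%: {steps_until(0.01, 0.9999)}")
```

Under those assumptions, a bit under half of the EMA is still the random initialization after one million images, which would be consistent with very noisy EMA samples. A smaller decay or an EMA warmup schedule would shrink that fraction faster.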
Could this be related to fp16 precision training?