improved-diffusion
EMA sampling produces only noise
Hi,
I am training on a dataset of 500 images (I know it's not a lot), with 4000 diffusion steps and the basic parameters proposed in the README.
Using the first model checkpoint at 10K steps, sampling produces only noise. Sampling with the same number of diffusion steps, or with 250 or 1000, makes no difference. However, the training loss is already at 0.03.
Do you have any idea what is happening here?
Automatic sampling every X training steps would be greatly appreciated, to avoid this kind of surprise.
Thanks a lot!
Where did you find the checkpoint? Is it created during training, or only once training has finished? There should be a /temp directory, but even after 45000 steps none was created. In which directory did you find it?
I found the checkpoint in tmp/openai-XXX/, the directory corresponding to that training run. I tried sampling from the 10000-step and 20000-step models.
Actually, it seems that it is only sampling with the EMA version of the model that produces noisy results.
Did you obtain good samples when using the non-ema models?
So basically: EMA is the exponential moving average of the parameters, so it makes sense that the EMA checkpoints from the first few thousand steps just produce noise. The checkpoints named "model" hold the most recently updated copy of the parameters, which is why they show results at an earlier training step.
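To put a rough number on this, here is a minimal sketch of the standard EMA update in plain Python. The rate of 0.9999 is an assumption (I believe it is the default in the training script, but check the `ema_rate` you actually used), and the function names are made up for illustration. With that rate, about 37% of the EMA weights at 10K steps still comes from the random initialization.

```python
def ema_update(ema, params, rate):
    """One standard EMA step: ema <- rate * ema + (1 - rate) * params."""
    return [rate * e + (1.0 - rate) * p for e, p in zip(ema, params)]


def init_weight_remaining(rate, steps):
    """Fraction of the EMA that still comes from the initial parameters,
    assuming the EMA copy starts out equal to the initial parameters."""
    return rate ** steps


if __name__ == "__main__":
    rate = 0.9999  # assumed EMA rate; use the value from your own run
    for steps in (10_000, 20_000, 50_000, 100_000):
        frac = init_weight_remaining(rate, steps)
        print(f"{steps:>7} steps: {frac:.4f} of the EMA is still the init")
```

A smaller EMA rate forgets the initialization faster, at the cost of less smoothing late in training.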
I get the same problem. Can you explain the reason? Thank you.
As I said, look up EMA: the parameters of the exponential moving average change very slowly compared to the "model" parameters. Fix: train (far) longer, or sample with the "model" weights instead of the EMA weights.
I have tried both the model and the EMA model, and both produce only noise. There is also an opt model. I don't know what it is for.
opt is not a model; it is the optimizer state, which is needed if you want to resume training.
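If it helps to tell the three kinds of files apart, here is a small sketch that sorts checkpoint file names by type. The naming patterns (`modelNNNNNN.pt`, `ema_RATE_NNNNNN.pt`, `optNNNNNN.pt`) are an assumption based on what I see in my own log directory, and `classify_checkpoint` is a hypothetical helper, not something from the repo.

```python
import re

# Assumed file name patterns; adjust them if your log directory differs.
_PATTERNS = {
    "model": re.compile(r"^model(\d+)\.pt$"),         # raw training weights
    "ema": re.compile(r"^ema_[0-9.]+_(\d+)\.pt$"),    # EMA-averaged weights
    "opt": re.compile(r"^opt(\d+)\.pt$"),             # optimizer state only
}


def classify_checkpoint(filename):
    """Return (kind, step) for a recognized checkpoint name, else None."""
    for kind, pattern in _PATTERNS.items():
        match = pattern.match(filename)
        if match:
            return kind, int(match.group(1))
    return None


if __name__ == "__main__":
    for name in ("model010000.pt", "ema_0.9999_010000.pt", "opt010000.pt"):
        print(name, "->", classify_checkpoint(name))
```

Only the "model" and "ema" files hold network weights you can sample from; the "opt" file is just there for resuming.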