
Sharing answer from the question "How many iterations do I need?"

Open SeungyounShin opened this issue 3 years ago • 4 comments

[Training Code]

```python
model = Unet(
    dim = 64,
    dim_mults = (1, 2, 4, 8)
).cuda()

diffusion = GaussianDiffusion(
    model,
    image_size = 128,
    timesteps = 1000,           # number of diffusion steps
    sampling_timesteps = 250,   # number of sampling timesteps
    loss_type = 'l1'            # L1 or L2
).cuda()

trainer = Trainer(
    diffusion,
    '/mnt/prj/seungyoun/dataset/flowers',
    train_batch_size = 128,
    train_lr = 1e-4,
    train_num_steps = 700000,         # total training steps (optimizer updates)
    gradient_accumulate_every = 2,    # gradient accumulation steps
    ema_decay = 0.995,                # exponential moving average decay
    amp = False                       # mixed precision (disabled here; see note below)
)

trainer.train()
```
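A quick sanity check on what this config implies (plain arithmetic, not part of the library): with gradient accumulation, each of the 700k optimizer steps consumes two micro-batches of 128 images.

```python
# Values copied from the config above.
train_batch_size = 128
gradient_accumulate_every = 2
train_num_steps = 700_000

# Each optimizer step processes batch_size * accumulation micro-batches.
effective_batch_size = train_batch_size * gradient_accumulate_every  # 256
total_images_seen = train_num_steps * effective_batch_size           # 179,200,000
```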

Note: setting `amp = True` hindered training in my runs.

[Training progress] (image attached)

SeungyounShin avatar Nov 09 '22 06:11 SeungyounShin

Thank you so much for the information. Have you checked any quantitative metrics during training? I am wondering whether the following is possible: the quantitative results converge while sampling quality keeps improving. If so, we cannot infer the needed training duration from those metrics alone; just train for as many steps as you can afford.

pengzhangzhi avatar Nov 22 '22 08:11 pengzhangzhi

Hello, amazing! I just want to confirm: does "iteration" refer to `train_num_steps` or to `timesteps`?

dannalily avatar Nov 25 '22 15:11 dannalily

I think it is `train_num_steps`.

pengzhangzhi avatar Nov 26 '22 02:11 pengzhangzhi

> I think it is `train_num_steps`.

Hello, I would like to know what `train_num_steps` means and how it relates to an epoch. Thank you for your answer.

lwtgithublwt avatar May 21 '24 01:05 lwtgithublwt
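For later readers: `train_num_steps` counts optimizer updates, not epochs; the Trainer keeps sampling batches until that count is reached. Assuming batches are sampled uniformly, an epoch equivalent can be estimated as below. The helper function and the dataset size are hypothetical, not part of the library.

```python
import math

def steps_per_epoch(dataset_size, batch_size, grad_accum):
    """Optimizer steps needed to see the whole dataset once,
    given that each step consumes batch_size * grad_accum images."""
    return math.ceil(dataset_size / (batch_size * grad_accum))

# With a flowers-style dataset of ~8,000 images and the config above:
spe = steps_per_epoch(8000, 128, 2)   # 32 steps per "epoch"
epochs = 700_000 / spe                # ~21,875 epoch equivalents
```

So at 700k steps this config amounts to roughly 20k passes over a small dataset; with a larger dataset the same step budget covers far fewer epochs.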