denoising-diffusion-pytorch

Training on own dataset produces black images after "sample-10"

Open twentyfiveYang opened this issue 2 years ago • 2 comments

Thank you for sharing the wonderful code! I was training on my own dataset but started getting black images after about 9000 steps. Does anyone know how to fix this? Here are my training script and results:

from denoising_diffusion_pytorch import Unet, GaussianDiffusion, Trainer

model = Unet(
    dim = 64,
    dim_mults = (1, 2, 4, 8)
).cuda()

diffusion = GaussianDiffusion(
    model,
    image_size = 64,
    timesteps = 1000,           # number of steps
    sampling_timesteps = 250,   # number of sampling timesteps (using ddim for faster inference [see citation for ddim paper])
    loss_type = 'l1'            # L1 or L2
).cuda()

trainer = Trainer(
    diffusion,
    'G:/code3/DDPM-main/data/train/',
    train_batch_size = 32,
    train_lr = 8e-5,
    train_num_steps = 700000,         # total training steps
    gradient_accumulate_every = 2,    # gradient accumulation steps
    ema_decay = 0.995,                # exponential moving average decay
    amp = True                        # turn on mixed precision
)

trainer.train()

[image]

twentyfiveYang avatar Oct 04 '22 09:10 twentyfiveYang

Do you see NaN loss during training? In my tests, turning off mixed precision (amp = False instead of amp = True in the Trainer) resolves NaN loss in certain cases, at the cost of slower training.
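
To check whether the loss actually turned NaN, one option is to scan the logged loss values for the first non-finite entry. The sketch below is only an illustration: `first_nonfinite_step` and the `logged` list are hypothetical and not part of denoising-diffusion-pytorch. In a PyTorch loop you would typically collect the values with `loss.item()`.

```python
import math


def first_nonfinite_step(losses):
    """Return the index of the first NaN/inf loss, or None if all are finite.

    Hypothetical helper for illustration; not part of the library.
    """
    for step, loss in enumerate(losses):
        if not math.isfinite(loss):
            return step
    return None


# Made-up loss values, for illustration only (not from the run above).
logged = [0.41, 0.22, 0.15, float("nan"), float("nan")]

bad_step = first_nonfinite_step(logged)
if bad_step is not None:
    print(f"loss became non-finite at step {bad_step}")
```

If this reports a step shortly before the samples turn black, retrying with amp = False in the Trainer is a reasonable first experiment.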

ruochiz avatar Oct 04 '22 18:10 ruochiz

Thank you very much for your reply. It solved my problem.

twentyfiveYang avatar Oct 05 '22 05:10 twentyfiveYang