A problem when training
I ran into a problem while training on a single RTX 4090: after about 36k steps, the predicted target contains some black sub-images. The learning rate is set to 5e-5 and the batch size is 64. Can you give me some advice?
(Attached images: predicted target, decode target, prime target.)
The training log is here:

```yaml
2024-08-20 20:54:35,177 - train_ldm.py -
autoencoder:
  pretrained_path: assets/stable-diffusion/autoencoder_kl.pth
ckpt_root: workdir/flickr192_large/noise_pred_20240820_80004/ckpts
config_name: flickr192_large
dataset:
  embed_dim: 1024
  grid_size: 12
  name: flickr
  path: ./dataset/scenery/train_ori/
  resolution: 192
hparams: noise_pred_20240820_80004
lr_scheduler:
  name: customized
  warmup_steps: 20000
mixed_precision: fp16
nnet:
  depth: 20
  embed_dim: 1024
  img_size: 24
  in_chans: 4
  mlp_ratio: 4
  mlp_time_embed: false
  name: uvit
  num_classes: 1001
  num_heads: 16
  patch_size: 2
  qkv_bias: false
  use_checkpoint: true
optimizer:
  betas: !!python/tuple
  - 0.99
  - 0.99
  lr: 0.0002
  name: adamw
  weight_decay: 0.03
pred: noise_pred
sample:
  algorithm: dpm_solver
  cfg: true
  mini_batch_size: 50
  n_samples: 50000
  path: ''
  sample_steps: 50
  scale: 0.4
sample_dir: workdir/flickr192_large/noise_pred_20240820_80004/samples
seed: 1234
train:
  batch_size: 64
  eval_interval: 2000
  log_interval: 10
  mode: cond
  n_steps: 320000
  save_interval: 4000
workdir: workdir/flickr192_large/noise_pred_20240820_80004
z_shape: !!python/tuple
- 4
- 24
- 24
```
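As a quick way to quantify the symptom (e.g. at each `eval_interval`), a minimal sketch that scans a decoded sample for all-black tiles might help track when the artifacts first appear. `find_black_patches` is a hypothetical helper, not part of the training code; it assumes decoded images are H×W×C arrays scaled to [0, 1].

```python
import numpy as np

def find_black_patches(image: np.ndarray, patch: int = 32,
                       thresh: float = 1e-3) -> list:
    """Return (row, col) tile indices whose pixels are near-zero,
    i.e. tiles that render as black sub-images.

    `image` is an H x W x C array with values in [0, 1];
    `patch` is the tile edge length in pixels."""
    h, w = image.shape[:2]
    bad = []
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            tile = image[i:i + patch, j:j + patch]
            # A tile whose mean magnitude is below `thresh` is flagged.
            if float(np.abs(tile).mean()) < thresh:
                bad.append((i // patch, j // patch))
    return bad
```

Logging the result per evaluation step would show whether the black tiles appear abruptly (which, with `mixed_precision: fp16`, could point to a numerical overflow/NaN somewhere in the pipeline) or accumulate gradually.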
Try longer iterations and larger batch sizes.