
add mixed precision training support for cyclegan turbo

King-HAW opened this issue 10 months ago • 4 comments

Hi Gaurav,

I've added mixed precision support for training CycleGAN-Turbo, so that unpaired training can run on a 24 GB NVIDIA GPU.

King-HAW avatar Apr 11 '24 20:04 King-HAW
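
For reference, here is a minimal sketch (toy model and data, not the actual PR code) of how mixed precision is typically requested through Hugging Face accelerate; the same setting can also be passed at launch time via `accelerate launch --mixed_precision bf16 ...`.

```python
# Minimal sketch, assuming a toy model: with accelerate, mixed precision is
# requested when constructing the Accelerator; autocast and (for fp16) loss
# scaling are then handled internally.
import torch
from accelerate import Accelerator

accelerator = Accelerator(mixed_precision="bf16")  # or "fp16"

model = torch.nn.Linear(8, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
model, optimizer = accelerator.prepare(model, optimizer)

for _ in range(3):
    x = torch.randn(4, 8, device=accelerator.device)
    loss = model(x).pow(2).mean()
    accelerator.backward(loss)  # scales the loss when fp16 scaling is active
    optimizer.step()
    optimizer.zero_grad()
```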

Tried to run this; it fails.

Loading model from: /home/ubuntu/miniconda3/envs/img2img-turbo/lib/python3.10/site-packages/lpips/weights/v0.1/vgg.pth
Steps:   0%|          | 0/25000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/ubuntu/repos/img2img-turbo/src/train_cyclegan_turbo.py", line 410, in <module>
    main(args)
  File "/home/ubuntu/repos/img2img-turbo/src/train_cyclegan_turbo.py", line 213, in main
    accelerator.clip_grad_norm_(params_gen, args.max_grad_norm)
  File "/home/ubuntu/miniconda3/envs/img2img-turbo/lib/python3.10/site-packages/accelerate/accelerator.py", line 2157, in clip_grad_norm_
    self.unscale_gradients()
  File "/home/ubuntu/miniconda3/envs/img2img-turbo/lib/python3.10/site-packages/accelerate/accelerator.py", line 2107, in unscale_gradients
    self.scaler.unscale_(opt)
  File "/home/ubuntu/miniconda3/envs/img2img-turbo/lib/python3.10/site-packages/torch/cuda/amp/grad_scaler.py", line 284, in unscale_
    optimizer_state["found_inf_per_device"] = self._unscale_grads_(optimizer, inv_scale, found_inf, False)
  File "/home/ubuntu/miniconda3/envs/img2img-turbo/lib/python3.10/site-packages/torch/cuda/amp/grad_scaler.py", line 212, in _unscale_grads_
    raise ValueError("Attempting to unscale FP16 gradients.")
ValueError: Attempting to unscale FP16 gradients.

seerdecker avatar Apr 17 '24 20:04 seerdecker
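
For readers hitting the same trace, a minimal sketch of why this occurs (my illustration, not the repo's code): PyTorch's GradScaler only unscales fp32 gradients, so fp16 training expects the trainable weights to stay fp32 while autocast runs the forward pass in half precision. If the weights themselves are cast to fp16, their gradients are fp16 too, and the `unscale_` call inside `clip_grad_norm_` raises exactly this ValueError.

```python
# Sketch reproducing the failure mode (assumption, not train_cyclegan_turbo.py):
# fp16 parameters produce fp16 gradients, which GradScaler refuses to unscale.
import torch

model = torch.nn.Linear(8, 1).cuda().half()  # fp16 params -> fp16 grads
opt = torch.optim.SGD(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler()

loss = model(torch.randn(4, 8, device="cuda", dtype=torch.float16)).mean()
scaler.scale(loss).backward()
scaler.unscale_(opt)  # ValueError: Attempting to unscale FP16 gradients.

# The working fp16 pattern keeps the params in fp32 and uses autocast instead:
# model = torch.nn.Linear(8, 1).cuda()
# with torch.autocast("cuda", dtype=torch.float16):
#     loss = model(torch.randn(4, 8, device="cuda")).mean()
```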

Hi, please try setting the mixed precision to bf16; that should work. My local GPU is an NVIDIA GeForce RTX 4090 24GB.

King-HAW avatar Apr 17 '24 22:04 King-HAW
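
Why bf16 helps (my note, not from the PR itself): accelerate only creates a GradScaler for fp16; under bf16 there is no loss scaling at all, so `accelerator.clip_grad_norm_` never reaches `scaler.unscale_` and the error above cannot trigger. A minimal sketch with an assumed toy model:

```python
# Sketch: under mixed_precision="bf16" accelerate skips GradScaler entirely,
# so gradient clipping works without any unscaling step.
import torch
from accelerate import Accelerator

accelerator = Accelerator(mixed_precision="bf16")
model = torch.nn.Linear(8, 1)
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
model, opt = accelerator.prepare(model, opt)

loss = model(torch.randn(4, 8, device=accelerator.device)).pow(2).mean()
accelerator.backward(loss)
accelerator.clip_grad_norm_(model.parameters(), 1.0)  # safe: no unscale_ path
opt.step()
opt.zero_grad()
```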

@King-HAW Hi, thanks for sharing. I run into the following problem when I use mixed precision:

ValueError: Query/Key/Value should either all have the same dtype, or (in the quantized case) Key/Value should have dtype torch.int32

query.dtype: torch.float32, key.dtype: torch.bfloat16, value.dtype: torch.bfloat16

But I solved this problem by running accelerate without --enable_xformers_memory_efficient_attention, following https://github.com/huggingface/accelerate/issues/2182. Did you meet the same problem before, and how did you solve it?

ACupoFruiTea avatar Jul 15 '24 11:07 ACupoFruiTea
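
For context (my sketch, not the repo's actual attention code): xformers' memory-efficient attention requires query, key, and value to share a dtype, and under bf16 autocast a query computed outside the autocast region can remain fp32 while key/value are bf16. One local workaround is to cast the three tensors to a common dtype before the attention call, e.g. with a hypothetical helper like this:

```python
# Sketch of a dtype-alignment workaround; align_qkv is an illustrative
# helper, not a function from img2img-turbo or xformers.
import torch

def align_qkv(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor):
    """Cast query/key/value to a common dtype before attention."""
    dtype = v.dtype  # follow the value tensor's dtype
    return q.to(dtype), k.to(dtype), v.to(dtype)

q = torch.randn(2, 8, 64, dtype=torch.float32)   # stayed fp32 outside autocast
k = torch.randn(2, 8, 64, dtype=torch.bfloat16)
v = torch.randn(2, 8, 64, dtype=torch.bfloat16)
q, k, v = align_qkv(q, k, v)
assert q.dtype == k.dtype == v.dtype == torch.bfloat16
```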

Hi @King-HAW, I have tried your fork, but the out-of-memory error still occurs (I use a 3090 with 24GB VRAM). Could you please explain how I can fix this error?

Thank you so much.

nldhuyen0047 avatar Aug 28 '24 10:08 nldhuyen0047