Leo Jiang

23 comments by Leo Jiang

Can you try #9829? I saved memory by implementing it :)

@riflemanl I used bf16 with DeepSpeed and Accelerate; it should work. Another reason is that FLUX is a 12B model, so it costs a lot of memory.
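For reference, a minimal sketch of the bf16 setup I mean (assuming DeepSpeed is already wired up through `accelerate config`; the training script usually drives this via its `--mixed_precision bf16` flag):

```python
from accelerate import Accelerator

# Rough sketch: run the training loop in bf16 so the 12B FLUX transformer's
# activations and gradients take roughly half the memory of fp32.
accelerator = Accelerator(mixed_precision="bf16")
```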

24GB is not enough. What's your hardware setup? Try reducing the batch size and resolution.

Batch size 1 with resolution 256 costs around 40GB on my 8-GPU setup. I think you should try offloading to CPU.
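A rough sketch of the kind of CPU offload I mean, using Accelerate's `DeepSpeedPlugin` (not the script's exact setup; adjust to your own config):

```python
from accelerate import Accelerator
from accelerate.utils import DeepSpeedPlugin

# Sketch only: ZeRO stage 2 with optimizer state offloaded to CPU trades GPU
# memory for host RAM and PCIe traffic. Offloading parameters as well would
# require ZeRO stage 3.
ds_plugin = DeepSpeedPlugin(
    zero_stage=2,
    offload_optimizer_device="cpu",
)
accelerator = Accelerator(mixed_precision="bf16", deepspeed_plugin=ds_plugin)
```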


@kopyl Before the modification, it just got stuck at almost the same VRAM when training with one GPU. After the modification, saving is fast. But I didn't check further.

@kopyl I think this issue is related to the accelerator; my change saves memory during the training process, but your issue is in the final saving step. Can you change `accelerator.is_main_process`...

@kopyl In the saving section, I mean. To the `if accelerator.is_main_process:` check, add `or accelerator.distributed_type == DistributedType.DEEPSPEED`, and don't forget to import `DistributedType` from accelerate.
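Something like this (a sketch, not the script's exact saving code; `"checkpoint"` is just a placeholder path):

```python
from accelerate import Accelerator, DistributedType

accelerator = Accelerator()

# In the saving section: under DeepSpeed every rank has to enter the saving
# branch, so the main-process check alone is not enough.
if accelerator.is_main_process or accelerator.distributed_type == DistributedType.DEEPSPEED:
    accelerator.save_state("checkpoint")  # placeholder path; use the script's output dir
```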