Giyeong Oh

Results: 40 comments of Giyeong Oh

Can you attach your environment details? Number of GPUs, accelerate configuration, installed Python libraries, training configuration, the script you use to run one of the sd-scripts, etc. If you provide as much detail as possible, you...
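If it helps, here is a quick, illustrative Python snippet (not part of sd-scripts) for gathering most of those details in one go; `accelerate env` and `pip freeze` cover the accelerate configuration and installed libraries.

```python
# Illustrative helper for collecting environment details for a bug report.
import subprocess
import sys

import torch

print("python:", sys.version)
print("torch:", torch.__version__, "| cuda:", torch.version.cuda)
print("gpu count:", torch.cuda.device_count())

# accelerate configuration and installed libraries
print(subprocess.run(["accelerate", "env"], capture_output=True, text=True).stdout)
print(subprocess.run(["pip", "freeze"], capture_output=True, text=True).stdout)
```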

While working around the above problem, I met another error: `TypeError: cannot pickle 'torch._C._distributed_c10d.ProcessGroup' object`. Same environment, but ~training SDXL network~ when I switched to a smaller dataset, it works on the SDXL network...
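For reference, a minimal sketch of the error mechanism (an assumption about the cause, not a confirmed diagnosis): anything that gets pickled, e.g. objects sent to DataLoader worker processes or deep-copied state, fails if it holds a reference to a `ProcessGroup`.

```python
# Minimal reproduction sketch of the pickling error; exact behavior may vary
# with the PyTorch version.
import pickle

import torch.distributed as dist

dist.init_process_group(
    "gloo", init_method="tcp://127.0.0.1:29500", rank=0, world_size=1
)

class Holder:
    def __init__(self):
        self.pg = dist.group.WORLD  # ProcessGroup handle is not picklable

try:
    pickle.dumps(Holder())
except TypeError as e:
    print(e)  # cannot pickle 'torch._C._distributed_c10d.ProcessGroup' object
```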

It is because `cache_text_encoder_outputs.py` does not prepare the DeepSpeed config, unlike `train_*.py`. You can add an ad-hoc fix for this: 1) `from library import deepspeed_utils` 2) between lines 174-178, ``` train_util.add_sd_models_arguments(parser)...
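A minimal sketch of that ad-hoc patch, assuming `library/deepspeed_utils.py` exposes `add_deepspeed_arguments(parser)` (verify the function name against your checkout of sd-scripts):

```python
# Ad-hoc patch sketch for cache_text_encoder_outputs.py: register the DeepSpeed
# arguments next to the existing add_*_arguments calls (around lines 174-178).
import argparse

from library import deepspeed_utils, train_util

def setup_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    train_util.add_sd_models_arguments(parser)       # existing call
    deepspeed_utils.add_deepspeed_arguments(parser)  # added: --deepspeed options
    # ... the rest of the original argument registration stays unchanged
    return parser
```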

First, thanks for reporting the issue. Is there a similar phenomenon on a different dataset?

> @BootsofLagrangian yes, on all datasets when using LION optimizer. I'm not sure, maybe LION optimizer should not work as good as Adam's optimizers with Deepspeed... But it doesn't break...

First, the U-Net can consume a batch of text-encoder outputs shaped like [n, **77**, 768]. So the training scripts utilize this property to extend the token length to 75, 150, 225, and so on....
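A rough sketch of the idea (not sd-scripts' exact implementation, which handles BOS/EOS slightly differently): split the long prompt into 75-token chunks, encode each one as a normal 77-token window, and concatenate the hidden states along the sequence dimension; the U-Net cross-attention accepts the resulting [n, 77*k, 768] conditioning.

```python
# Sketch: extend the effective token length by chunking and concatenating
# text-encoder outputs.
import torch
from transformers import CLIPTextModel, CLIPTokenizer

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

@torch.no_grad()
def encode_long_prompt(prompt: str, chunk_len: int = 75) -> torch.Tensor:
    ids = tokenizer(prompt, truncation=False, add_special_tokens=False).input_ids
    bos, eos = tokenizer.bos_token_id, tokenizer.eos_token_id
    chunks = [ids[i : i + chunk_len] for i in range(0, len(ids), chunk_len)] or [[]]
    hidden = []
    for chunk in chunks:
        # each chunk becomes a standard 77-token window: BOS + tokens + EOS padding
        padded = [bos] + chunk + [eos] * (chunk_len + 1 - len(chunk))
        out = text_encoder(torch.tensor([padded]))
        hidden.append(out.last_hidden_state)  # [1, 77, 768]
    return torch.cat(hidden, dim=1)           # [1, 77 * num_chunks, 768]

print(encode_long_prompt("a very long prompt " * 30).shape)
```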

If a user wishes to utilize multiple captions, derived from raw data, a tagger, or a Vision-Language Model (VLM), the script could handle this through an alternative format or file....
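One hypothetical way to do that (this metadata format is an illustration, not something sd-scripts currently supports): keep several captions per image in a JSON file and sample one per step, so captions from raw data, a tagger, and a VLM can coexist.

```python
# Hypothetical multi-caption metadata and a simple random-selection helper.
import random

metadata = {
    "images/0001.png": {
        "captions": [
            "a photo of a cat on a sofa",            # raw caption
            "cat, sofa, indoors, sunlight",          # tagger output
            "A tabby cat lounging on a gray sofa.",  # VLM caption
        ]
    }
}

def pick_caption(image_key: str) -> str:
    # one caption per sample; weighting or concatenation would also be possible
    return random.choice(metadata[image_key]["captions"])

print(pick_caption("images/0001.png"))
```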

> @BootsofLagrangian Do you have any idea what might be causing this problem?

Interesting findings. DeepSpeed upcasts precision for the optimizer step. It might be one of the reasons, but...
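To make the upcasting point concrete: with fp16 mixed precision, DeepSpeed keeps fp32 master weights and optimizer states, so the optimizer math runs in fp32 even though forward/backward run in fp16. An illustrative config (values are placeholders, not the ones from this issue):

```python
# Illustrative DeepSpeed config as a Python dict: forward/backward in fp16,
# optimizer states and master weights kept in fp32 (and sharded by ZeRO-2).
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},
}
```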

@jihnenglin I saw loss divergence under some unknown conditions, but I still cannot find the reason why the model diverges.