Giyeong Oh
Can you attach your environment details? Number of GPUs, accelerate configuration, installed Python libraries, training configuration, the script used to run one of the sd-scripts, etc. If you provide as much detail as possible, you...
While working on the above problem, I met another error: `TypeError: cannot pickle 'torch._C._distributed_c10d.ProcessGroup' object`. Same environment, but ~~training the SDXL network~~ when I changed to a smaller dataset, it works on the SDXL network...
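For reference, a minimal repro sketch of that pickling failure, assuming a distributed run (e.g. launched via `torchrun`). Anything that implicitly pickles an object holding a `ProcessGroup` handle, such as spawning DataLoader workers or a `deepcopy`, hits the same error:

```python
import pickle
import torch.distributed as dist

dist.init_process_group("nccl")  # assumes a torchrun/accelerate-style launch
pg = dist.group.WORLD            # a live ProcessGroup handle

# ProcessGroup wraps communicator state that cannot be serialized:
pickle.dumps(pg)  # TypeError: cannot pickle 'torch._C._distributed_c10d.ProcessGroup' object
```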
cache_text_encoder_outputs.py raises AttributeError: 'Namespace' object has no attribute 'deepspeed'
It is because `cache_text_encoder_outputs.py` does not prepare the DeepSpeed config, unlike the `train_*.py` scripts. You can add an ad-hoc fix for this:

1) `from library import deepspeed_utils`
2) between lines 174-178, register the DeepSpeed arguments next to the existing ones:

```
train_util.add_sd_models_arguments(parser)
deepspeed_utils.add_deepspeed_arguments(parser)  # adds the missing --deepspeed args
```
Did you update this repo? Try the script again after a `git pull`.
First, thanks for reporting the issue. Is there a similar phenomenon on a different dataset?
> @BootsofLagrangian yes, on all datasets when using LION optimizer.

I'm not sure; maybe the Lion optimizer just doesn't work as well as the Adam-family optimizers with DeepSpeed... But it doesn't break...
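For context, here is a sketch of Lion's update rule (from the Lion paper, not DeepSpeed's implementation). Because the step is sign-based, small precision differences in the momentum buffer can flip individual signs, which is one plausible way a mixed-precision wrapper could affect it more than Adam:

```python
import torch

def lion_step(param, grad, momentum, lr=1e-4, beta1=0.9, beta2=0.99, wd=0.0):
    # Interpolate momentum and gradient, then keep only the sign.
    update = torch.sign(beta1 * momentum + (1 - beta1) * grad)
    new_param = param - lr * (update + wd * param)        # decoupled weight decay
    new_momentum = beta2 * momentum + (1 - beta2) * grad  # momentum updated after the step
    return new_param, new_momentum
```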
First, the U-Net can consume a batch of text-encoder outputs shaped like [n, **77**, 768]. So the training scripts exploit this property to extend the token length to 75, 150, 225, and so on; see the sketch below....
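A minimal sketch of that chunk-and-concatenate trick, assuming a `transformers` `CLIPTextModel`-style encoder; the helper name and chunking details are illustrative, not the actual sd-scripts code:

```python
import torch

BOS, EOS, CHUNK = 49406, 49407, 75  # CLIP special tokens; 75 content tokens per chunk

def encode_long_prompt(text_encoder, token_ids: torch.Tensor) -> torch.Tensor:
    # token_ids: [n, L] content token ids without BOS/EOS, L a multiple of 75.
    n, L = token_ids.shape
    outs = []
    for i in range(0, L, CHUNK):
        chunk = token_ids[:, i : i + CHUNK]
        bos = torch.full((n, 1), BOS, dtype=chunk.dtype, device=chunk.device)
        eos = torch.full((n, 1), EOS, dtype=chunk.dtype, device=chunk.device)
        ids = torch.cat([bos, chunk, eos], dim=1)         # [n, 77] per chunk
        outs.append(text_encoder(ids).last_hidden_state)  # [n, 77, 768]
    # The U-Net's cross-attention accepts any sequence length, so [n, 77*k, 768] works.
    return torch.cat(outs, dim=1)
```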
If a user wishes to use multiple captions, derived from raw data, a tagger, or a Vision-Language Model (VLM), the script could handle this through an alternative format or file, as in the sketch below....
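As a hypothetical example (not an existing sd-scripts format), a metadata file could hold a caption list per image, and the dataset could sample one caption per access:

```python
import json, random

# Hypothetical metadata layout:
# {"img_001": {"captions": ["a photo of a cat", "cat, indoors, sitting"]}}
def pick_caption(metadata_path: str, image_key: str) -> str:
    with open(metadata_path) as f:
        meta = json.load(f)
    # Sampling at access time lets each epoch see a different caption.
    return random.choice(meta[image_key]["captions"])
```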
> @BootsofLagrangian Do you have any idea what might be causing this problem?

Interesting behavior. DeepSpeed upcasts precision for its optimizer operations. It might be one of the reasons, but...
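For illustration, a sketch of where that upcast lives in a DeepSpeed config (values are assumptions, not a recommended setup): with bf16 enabled, ZeRO still keeps fp32 master weights and optimizer states, so the optimizer math runs in higher precision than the model:

```python
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "bf16": {"enabled": True},          # forward/backward run in bf16
    "zero_optimization": {"stage": 2},  # partitions the fp32 optimizer states across ranks
}
```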
@jihnenglin I saw loss divergence under some unknown conditions, but I still cannot find the reason why the model diverges.