Mikael Hirki
It's probably running on CPU instead of GPU. Also, an RTX 3080 doesn't have enough VRAM to train a Flux LoRA without quantization. Most RTX 3080 cards are either 10 GB...
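For a quick sanity check of both points, `nvidia-smi` shows whether the GPU is visible at all and how much VRAM it actually has (assuming the NVIDIA driver is installed):

```bash
# List every visible GPU with its total VRAM.
# No output or an error here usually means the training process
# can't see the GPU and is falling back to CPU.
nvidia-smi --query-gpu=name,memory.total --format=csv
```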
So this is a multi-GPU machine? That could also be causing these issues if it's trying to use both GPUs and the slower GPU is holding everything back. You can...
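One quick way to rule this out is to pin the run to a single GPU with `CUDA_VISIBLE_DEVICES` (a sketch; `python train.py` stands in for your actual launch command):

```bash
# Expose only GPU 0 to the process; indices follow nvidia-smi order.
CUDA_VISIBLE_DEVICES=0 python train.py
```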
And since you don't always have internet access, you should probably run `wandb offline`. Alternatively, switch to TensorBoard, which works locally.
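For example (the `./runs` log directory is just a placeholder for wherever your logs are written):

```bash
# Keep wandb runs on local disk instead of syncing to the cloud:
wandb offline

# Or, with TensorBoard, serve the local log directory:
tensorboard --logdir ./runs
```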
One CPU core at 100% is normal. GPU should be busy when pre-computing the text embeds. There's probably something wrong with your system specifically. SimpleTuner is using CUDA 12.4 so...
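A quick way to see which CUDA runtime your PyTorch build is using, and what the installed driver supports:

```bash
# CUDA runtime version PyTorch was compiled against:
python -c "import torch; print(torch.version.cuda)"

# Driver version and the maximum CUDA version it supports
# (shown in the header of the output):
nvidia-smi
```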
@sneccc Just out of curiosity, is ComfyUI loading the model in lowvram mode, i.e. does it say "loaded in lowvram mode" in the console?
> > @sneccc Just out of curiosity, is ComfyUI loading the model in lowvram mode, i.e. does it say "loaded in lowvram mode" in the console?
>
> idk what...
You seem to be running the release branch. The fix is on the main branch.
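Assuming a normal git checkout of the repo, switching is just:

```bash
# Move from the release branch to main and pull the fix:
git checkout main
git pull
```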
Have a look at the output of the `dmesg` command. If it says something about the OOM killer, you're running out of RAM.
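For example:

```bash
# Search the kernel log for OOM-killer activity (may need sudo):
sudo dmesg | grep -iE "out of memory|oom"
```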
Commit 3310c672d4c7b688373026a8faf0eba2ea267709 may have increased RAM usage if the text encoders remain in RAM during training. One idea would be to partially revert this commit so that text encoders are...
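One rough way to do a partial revert (a sketch, not a tested recipe):

```bash
# Stage the inverse of the commit without committing it:
git revert --no-commit 3310c672d4c7b688373026a8faf0eba2ea267709
# ...undo the hunks you want to keep, then commit the rest:
git commit -m "Partially revert 3310c672 to keep text encoders out of RAM"
```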
System RAM usage should be several gigabytes lower now that pull request #694 has been merged.