deGENERATIVE-SQUAD
Found this PR finally. I hope you don’t mind if I share something relevant to the topic. A guy from a forum made this wavelet implementation based on many...
@WinodePino
> I am trying to test this, but somehow my caching of latents gets stuck at 0%, any idea why this could be?

I’m not using the cache at all -...
@WinodePino
> Thanks, it is working now, but it doesn’t seem to learn anything.

If the iteration is proceeding at a normal speed, then obviously the problem lies in the too...
> #1866

I tried both the LoRA algorithm and the GLoRA+DoRA algorithm on SDXL - no noticeable decrease in VRAM usage. Speeds are the same with and without fused_backward_pass also...
> Try it without my proposed changes, apparently it was already working if you set --fused_back_pass as an optimizer arg

Tested it earlier:
1. back_pass as an optimizer argument is...
Problem solved in the new version of https://github.com/LoganBooker/prodigy-plus-schedule-free. According to the issue thread https://github.com/LoganBooker/prodigy-plus-schedule-free/issues/7, it now works with full finetunes and LoRAs (the problem was the lack of fused support for LoRA in...
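Since this keeps coming up: below is a minimal, hedged sketch of what fused_back_pass looks like when constructing the optimizer directly in Python. The import path and the lr value are assumptions based on the package layout, not something confirmed in this thread.

```python
import torch
from prodigyplus.prodigy_plus_schedulefree import ProdigyPlusScheduleFree  # assumed import path

model = torch.nn.Linear(64, 64)  # stand-in for the LoRA / finetune parameters

# fused_back_pass is the argument discussed above: the optimizer applies its update
# as gradients arrive during backward() instead of in a separate full step, which is
# where the VRAM saving is expected to come from.
optimizer = ProdigyPlusScheduleFree(
    model.parameters(),
    lr=1.0,                # Prodigy-style optimizers are normally left at lr=1.0 (assumption)
    fused_back_pass=True,
)
```

In sd-scripts terms, this is what setting fused_back_pass as an optimizer arg (via --optimizer_args) is doing.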
@FurkanGozukara
> what optimizer parameters are required?

Basically: d0, eps, d_coef, use_stableadamw, and stochastic_rounding.
Optionally: use_bias_correction, factored, weight_decay, split_groups (for learning rate splitting between U-Net and TE).

> i did...
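To make that list concrete, here is a rough sketch with those parameters passed straight to the constructor. The values are illustrative placeholders rather than recommendations, and the import path is assumed from the package name.

```python
from prodigyplus.prodigy_plus_schedulefree import ProdigyPlusScheduleFree  # assumed import path

optimizer = ProdigyPlusScheduleFree(
    model.parameters(),        # `model` is whatever network/LoRA you are training
    # the "basically" set:
    d0=1e-6,                   # initial step-size estimate
    eps=1e-8,
    d_coef=1.0,                # scales the adapted step size
    use_stableadamw=True,
    stochastic_rounding=True,  # matters when weights are stored in bf16
    # the "optionally" set:
    use_bias_correction=True,
    factored=False,
    weight_decay=0.0,
    split_groups=True,         # separate adaptation per param group, e.g. U-Net vs. TE
)
```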
Show your full config and describe the dataset characteristics. At what step do the NaNs start? Usually, NaN points to an underlying issue - for example, if you are using full_fp16, then some optimizers and...
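On the "at what step" question, here is a quick sketch of the kind of check that pins down the first non-finite step. This is generic PyTorch dropped into your own loop, not sd-scripts' actual training code, and assert_finite is just a name made up for the example.

```python
import torch

def assert_finite(step: int, loss: torch.Tensor, model: torch.nn.Module) -> None:
    """Fail loudly at the first step where the loss or any gradient goes NaN/Inf."""
    if not torch.isfinite(loss):
        raise RuntimeError(f"Non-finite loss at step {step}: {loss.item()}")
    for name, param in model.named_parameters():
        if param.grad is not None and not torch.isfinite(param.grad).all():
            raise RuntimeError(f"Non-finite gradient in '{name}' at step {step}")

# Call it right after loss.backward(), e.g.:
#   loss.backward()
#   assert_finite(global_step, loss, network)   # `network` = the module being trained
```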
@kohya-ss

```
Traceback (most recent call last):
  File "K:\sd-scripts\sd-scripts-sd3\sdxl_train_network.py", line 229, in <module>
    trainer.train(args)
  File "K:\sd-scripts\sd-scripts-sd3\train_network.py", line 1403, in train
    loss = self.process_batch(
           ^^^^^^^^^^^^^^^^^^^
  File "K:\sd-scripts\sd-scripts-sd3\train_network.py", line 463, in process_batch
    huber_c ...
```
@kohya-ss This definitely doesn’t depend on the optimizer being used - the same issue occurs with Adafactor or any other optimizer. d_limiter is not the root of the problem, it...