
Patching LoRAs for KModel: 99%|████████████████████████████████████████████████████▎| 300/304 [01:40<00:01, 2.36it/s]

Open Kr01iKs opened this issue 1 year ago • 18 comments

And crashes.

Kr01iKs avatar Aug 18 '24 04:08 Kr01iKs

Same, but for me it crashes at 240/304.

moudahaddad14 avatar Aug 18 '24 04:08 moudahaddad14

Update and try again.

lllyasviel avatar Aug 18 '24 06:08 lllyasviel

Updated, and it gives this error while using flux1-dev-bnb-nf4-v2.safetensors:

Patching LoRAs for KModel: 59%|█████▉ | 179/304 [00:02<00:02, 48.13it/s]
ERROR lora diffusion_model.double_blocks.17.txt_mlp.2.weight CUDA out of memory. Tried to allocate 144.00 MiB. GPU
Patching LoRA weights failed. Retrying by offloading models.
Patching LoRAs for KModel: 100%|██████████| 304/304 [00:05<00:00, 51.52it/s]

If I instead use flux1-dev-fp8.safetensors, the LoRA loads fine.

killerciao avatar Aug 18 '24 06:08 killerciao
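For context, each of the 304 steps patches one weight tensor: the LoRA delta is added to the base weight, roughly W' = W + (alpha / rank) * (B @ A). The sketch below illustrates the idea (an illustration with torch, not Forge's actual code); with an NF4 checkpoint the base weight presumably has to be dequantized to a higher-precision working copy first, which would explain the transient VRAM spikes and the OOM above.

```python
# Illustration only: what "patching" one LoRA weight roughly means.
# Not Forge's implementation; torch assumed, names hypothetical.
import torch

def patch_lora_weight(base_weight: torch.Tensor,   # W, shape (out, in)
                      lora_down: torch.Tensor,     # A, shape (rank, in)
                      lora_up: torch.Tensor,       # B, shape (out, rank)
                      alpha: float) -> torch.Tensor:
    rank = lora_down.shape[0]
    # W' = W + (alpha / rank) * (B @ A); the matmul and the full-size
    # higher-precision working copy briefly cost extra memory per tensor.
    delta = (alpha / rank) * (lora_up.float() @ lora_down.float())
    return (base_weight.float() + delta).to(base_weight.dtype)
```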

@killerciao does this give you an image eventually? If it does, then I will just use this method.

lllyasviel avatar Aug 18 '24 06:08 lllyasviel

@killerciao does this give you an image eventually? If it does, then I will just use this method.

Yes, with flux1-dev-bnb-nf4-v2.safetensors the image is generated after the errors, but without the effects of the LoRA.

killerciao avatar Aug 18 '24 06:08 killerciao

Is there a reason why this process runs at each generation? Once would be okay, but it happens every time I click generate, making LoRAs pretty unusable for me.

Dampfinchen avatar Aug 18 '24 08:08 Dampfinchen

Still crashes.

Patching LoRAs for KModel:  44%|███████████████████████▎                             | 134/304 [00:03<00:03, 44.72it/s]
ERROR lora diffusion_model.double_blocks.13.txt_mod.lin.weight Allocation on device
Patching LoRA weights failed. Retrying by offloading models.
Press any Key to Continue . . .

When this happens, other applications may crash or glitch. If "Enabled for UNet (always maximize offload)" is checked, the OOM occurs in RAM instead of VRAM. (I have 32 GB of RAM.)

iqddd avatar Aug 18 '24 09:08 iqddd

I have been using Forge to generate Flux (NF4 version) images, but when I use LoRAs in the prompt my computer freezes. This is the error:

Patching LoRAs for KModel: 52% …
Patching LoRA weights failed. Retrying by offloading models.
Patching LoRAs for KModel: 92%|… | 280/304 [… , 9.31it/s]
ERROR lora diffusion_model.single_blocks.… weight Allocation on device
Patching LoRA weights failed. Retrying by offloading models.

The computer freezes after these lines. I can’t even take a screenshot, here's a photo: https://imgur.com/a/A10HvRt

The LoRA is around 160 MB.

I have a 3060 12 GB and 32 GB of RAM.

evelryu avatar Aug 18 '24 19:08 evelryu

Same issue for me; I can't use any LoRAs where it counts up to 304 (those that only count up to 76 work great, with fast patching too). The best I could get to was 270/304, then it stops counting, a "Connection timeout" popup appears in the browser, and then "Press any key to continue . . .".

For the time being, does anyone know how to tell in advance whether a LoRA will count up to 76 rather than 304, so we know which ones are usable right now? File size is not an indicator: I had a 19 MB LoRA count up to 304, yet a 51 MB LoRA count up to 76 ...

P.S.: Unfortunately, it's still the same problem after today's (19/8) update. I tried with flux1-dev-fp8, though, and patching was successful, so it's only an issue with flux1-dev-bnb-nf4-v2.

Vinfamy-New avatar Aug 18 '24 21:08 Vinfamy-New
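On telling in advance: the count appears to correspond to the number of tensors the LoRA patches, which depends on which layers it targets rather than on file size. A rough heuristic is to enumerate the keys in the .safetensors file (this assumes the safetensors package; key naming varies between trainers, so the marker strings below are illustrative):

```python
# Rough heuristic for predicting the "count" (76 vs. 304) in the
# patching progress bar by counting unique LoRA target modules.
from safetensors import safe_open

def count_patch_targets(lora_path: str) -> int:
    prefixes = set()
    with safe_open(lora_path, framework="pt", device="cpu") as f:
        for key in f.keys():
            # Each patched module usually contributes a ".lora_up"/".lora_down"
            # (or ".lora_A"/".lora_B") pair; count unique module prefixes.
            for marker in (".lora_up", ".lora_down", ".lora_A", ".lora_B"):
                if marker in key:
                    prefixes.add(key.split(marker)[0])
    return len(prefixes)

print(count_patch_targets("my_flux_lora.safetensors"))  # hypothetical file name
```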

Same issue for me.

AcademiaSD avatar Aug 19 '24 02:08 AcademiaSD

I also can't use those LoRAs which count to 304 (flux1-dev-bnb-nf4-v2).

YNWALALALA avatar Aug 19 '24 16:08 YNWALALALA

+1 on this issue - it crashes the Remote Desktop session and freezes the PC for me too.

Running on a 4090 with 24 GB VRAM.

JuiceWill avatar Aug 19 '24 17:08 JuiceWill

I found a small, 18 MB Flux LoRA on Civitai. If I start with that LoRA first, patching fails at around step 278-288 out of 304 (same as always, no matter the size of the LoRA), but the image is still generated and the LoRA actually works - I use a face LoRA that the base checkpoint doesn't know, so I can be sure it worked. THEN I can generate a second image with another, large, even 800 MB LoRA. Suddenly even LoRA patching works fine all the way to the last step, 304, and everything works as it should. So it's not completely dead :)

I've tried your method, but with my 2060 and 12 GB of VRAM, the patching gets to 250 of 304 and then it crashes.

Samael-1976 avatar Aug 20 '24 09:08 Samael-1976

The latest updates still don't fix the process. On Windows 11 with 32 GB RAM and 16 GB VRAM, it crashes due to OOM about midway through the process. Is there any way to reduce memory and video memory consumption? Or maybe there is a way to pre-convert the LoRA to the desired format on a system with more memory?

iqddd avatar Aug 20 '24 09:08 iqddd
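On pre-converting with a more memory-capable system: one option along those lines would be to merge the LoRA into an fp16 checkpoint offline on the CPU, where only system RAM matters. This is a sketch of what such a pre-merge could look like, not a Forge feature; file names and the lora_up/lora_down key layout are hypothetical and differ between trainers:

```python
# Hypothetical offline pre-merge of a LoRA into an fp16 checkpoint on CPU.
# Not a Forge feature; file names and key naming are illustrative.
import torch
from safetensors.torch import load_file, save_file

base = load_file("flux1-dev-fp16.safetensors")   # hypothetical base checkpoint
lora = load_file("my_lora.safetensors")          # hypothetical LoRA file
scale = 1.0                                      # LoRA strength * (alpha / rank)

for key, down in lora.items():
    if ".lora_down.weight" not in key:
        continue
    up = lora[key.replace(".lora_down.", ".lora_up.")]
    target = key.split(".lora_down.")[0] + ".weight"
    if target in base:
        merged = base[target].float() + scale * (up.float() @ down.float())
        base[target] = merged.to(base[target].dtype)

save_file(base, "flux1-dev-fp16-merged.safetensors")
```

A checkpoint merged this way could then be loaded without any per-generation patching, at the cost of baking in one fixed LoRA strength.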

An 18 MB LoRA can basically count to 304. I can now use LoRAs of 36,530 KB, 47,489 KB, and 51,142 KB; these LoRAs count normally (because they only count to about 100). Larger LoRAs still have the problem of not being able to count to 304.

YNWALALALA avatar Aug 20 '24 14:08 YNWALALALA

304 steps is where I get the issues, regardless of MB size, even at 18 MB.

markdee3 avatar Aug 20 '24 20:08 markdee3

I found the solution: under "Diffusion in Low Bits" (center-top of your screen), change it to "Automatic (fp16 LoRA)". Patching LoRAs will get to 304/304 (100%) instantly! And yes, the LoRAs will then work, even with flux1-dev-bnb-nf4-v2 and flux1-schnell-bnb-nf4.

How is that not the default setting!?

Vinfamy-New avatar Aug 22 '24 18:08 Vinfamy-New

I found the solution: under "Diffusion in Low Bits" (center-top of your screen), change it to "Automatic (fp16 LoRA)". Patching LoRAs will get to 304/304 (100%) instantly! And yes, the LoRAs will then work, even with flux1-dev-bnb-nf4-v2 and flux1-schnell-bnb-nf4.

How is that not the default setting!?

THIS IS THE SOLUTION

Gravityhorse avatar Aug 24 '24 15:08 Gravityhorse