stable-diffusion-webui-forge
FLUX - NF4 uses a lot of RAM and VRAM when a LoRA is being used.
For some reason dev fp8 works better than NF4 when I use a LoRA. NF4 uses a lot of VRAM and RAM, which makes generation absurdly slow.
With dev fp8 I'm getting speeds like 1.01 s/it, and with NF4 I'm getting around 29 s/it.
I think it's due to the LoRA being patched and the VRAM/RAM not being released after that.
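
One way to test this guess would be to compare GPU memory before and after the LoRA is applied, then again after forcing a cleanup. The snippet below is only a rough sketch, not Forge's own code: `vram_snapshot` and `release_cached_memory` are hypothetical helper names, and it assumes PyTorch with a CUDA device (it returns `None` or does nothing otherwise).

```python
import gc


def vram_snapshot():
    """Return (allocated_MiB, reserved_MiB) for the current CUDA device.

    Returns None when PyTorch or CUDA is not available.
    Hypothetical helper for illustration, not part of Forge.
    """
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    mib = 1024 ** 2
    allocated = torch.cuda.memory_allocated() / mib  # memory held by live tensors
    reserved = torch.cuda.memory_reserved() / mib    # memory held by the caching allocator
    return (allocated, reserved)


def release_cached_memory():
    """Drop unreachable Python objects, then return cached CUDA blocks to the driver."""
    gc.collect()
    try:
        import torch
    except ImportError:
        return
    if torch.cuda.is_available():
        torch.cuda.empty_cache()


if __name__ == "__main__":
    print("before cleanup:", vram_snapshot())
    release_cached_memory()
    print("after cleanup: ", vram_snapshot())
```

If the allocated figure stays high after the cleanup call, something still holds references to the tensors (for example, a leftover copy of the patched weights), which would fit the guess above. If only the reserved figure was high and it drops, the memory was merely cached.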