0.3.76: LoRA causes huge slowdown and low VRAM allocation
Custom Node Testing
- [x] I have tried disabling custom nodes and the issue persists (see how to disable custom nodes if you need help)
Your question
In 0.3.76, using a LoRA causes a huge slowdown.
VRAM allocation is also low in general (a large MB buffer is reserved).
The issues are especially prevalent with z-image-turbo.
Logs
Other
No response
Are you using the bf16 weights?
> Are you using the bf16 weights?
Using the BF16 and normal fp8 weights almost negates the speed loss, thanks. (It's the KJ scaled weights causing the huge speed loss.)
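For reference, here's a rough way to check which dtypes a checkpoint actually contains, by reading only the safetensors header (no weights get loaded; the file path below is just a placeholder):

```python
import json
import struct
from collections import Counter

def checkpoint_dtypes(path):
    """Count tensor dtypes listed in a .safetensors header without loading weights."""
    with open(path, "rb") as f:
        header_len = struct.unpack("<Q", f.read(8))[0]  # first 8 bytes: header size (little-endian u64)
        header = json.loads(f.read(header_len))          # JSON header: per-tensor dtype/shape/offsets
    return Counter(
        entry["dtype"] for name, entry in header.items() if name != "__metadata__"
    )

# Placeholder path -- point it at whatever checkpoint you are loading.
print(checkpoint_dtypes("models/diffusion_models/z_image_turbo_bf16.safetensors"))
# A plain bf16 file reports only 'BF16'; a "scaled fp8" file shows 'F8_E4M3' tensors
# plus extra scale tensors.
```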
However, when any LoRA is enabled with any model, the VRAM allocation is off: a huge MB buffer is reserved instead of the VRAM actually being used, which also causes overflow into lowvram patches.
Also, with this update, if I try to use something like Technically Color Z it just ruins the image quality.
Images with and without the LoRA:
How big is your reserved buffer and what is your GPU? Can you post the load stats line?
The update made LoRAs apply correctly on Z Image; before that, most of the LoRA weights were getting skipped, so the LoRAs were much weaker than they should have been.
> How big is your reserved buffer and what is your GPU? Can you post the load stats line?
The buffer is 1.6 GB with a LoRA added, leaving me using only about 3.6 GB of VRAM total. Without a LoRA the buffer is roughly 100 MB and I use about 5 GB of VRAM total.
My GPU is Pascal. Before 0.3.75/0.3.76 this issue didn't exist.
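For what it's worth, here's a rough snippet to check whether that gap is really memory the PyTorch allocator is holding as reserved-but-unused (this is only the allocator's view, not the same number as ComfyUI's load stats line):

```python
import torch

def report_vram(device=0):
    """Print allocated vs reserved VRAM as seen by the PyTorch CUDA allocator."""
    alloc = torch.cuda.memory_allocated(device) / 1024**2
    reserved = torch.cuda.memory_reserved(device) / 1024**2
    total = torch.cuda.get_device_properties(device).total_memory / 1024**2
    print(f"allocated: {alloc:.0f} MB | reserved: {reserved:.0f} MB | total: {total:.0f} MB")

report_vram()
```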
> The update made LoRAs apply correctly on Z Image; before that, most of the LoRA weights were getting skipped, so the LoRAs were much weaker than they should have been.
Yeah, but now it's just completely destroying the image (like the image I posted) for me unless I use a resolution like 1088x1440.
@comfyanonymous The VRAM allocation issue also occurs with Flux2 LoRAs. When I use a LoRA, the reserved buffer is huge with a lot of free VRAM, and it's causing significant speed degradation.
With LoRA (RTX 4090, 19.92s/it, 12636.00 MB buffer reserved, lowvram patches: 152):
Without LoRA (12.20s/it, 972.00 MB buffer reserved, lowvram patches: 0):
I'm using core nodes to load the weights and the LoRA:
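Roughly the equivalent graph in API format as a Python dict; the node and field names are written from memory rather than copied from my exported workflow, and the file names are placeholders, so treat this as a sketch:

```python
import json

# Sketch of the two core loader nodes in ComfyUI API format (names from memory,
# file names are placeholders -- double-check against your own exported workflow).
prompt = {
    "1": {
        "class_type": "UNETLoader",
        "inputs": {
            "unet_name": "flux2_dev.safetensors",
            "weight_dtype": "default",  # the field the "weight_dtype default" suggestion below refers to
        },
    },
    "2": {
        "class_type": "LoraLoaderModelOnly",
        "inputs": {
            "model": ["1", 0],                 # take the MODEL output of node 1
            "lora_name": "my_lora.safetensors",
            "strength_model": 1.0,
        },
    },
}
print(json.dumps(prompt, indent=2))
```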
To ensure that the issue was not related to my installation, I created a new virtual environment and reinstalled all dependencies before testing.
Use weight_dtype default.
Same result:
OK @VandersonQk, I think I see your problem, and I'm working on it. I'm going to track this one over here, which I'm pretty sure is the same report:
https://github.com/comfyanonymous/ComfyUI/issues/11058
The VRAM allocation bug is currently happening on all models for me (when using a LoRA).