VRAM not being offloaded on AMD with WSL
Your question
I have a 24 GB AMD card officially supported by AMD (7900 XTX). When trying to load bigger models like Flux or HiDream in Q6, they should fit into 24 GB, but they don't: the CLIP gets fully loaded and the diffusion model only partially.
When observing VRAM usage with AMD Adrenalin, I don't see a reduction in VRAM once the CLIP has finished, which leads me to believe that ComfyUI doesn't offload the CLIP to system RAM like it should and instead stacks the diffusion model on top of the already full VRAM, resulting in a partially loaded model.
Furthermore, when running the CLIPs on the CPU, the Q6 diffusion model fits easily in only 16 GB of VRAM. This adds to my suspicion that something is wrong with the offload, because VRAM usage should be the same with the CLIPs on GPU or CPU if ComfyUI performs a correct offload to system RAM.
(Correct me if I misunderstood something.)
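For reference, a rough back-of-the-envelope estimate of the expected weight sizes. The parameter counts and bits-per-weight below are assumptions for illustration (a Flux-class model at ~12B parameters, GGUF Q6_K at roughly 6.56 effective bits per weight, and a T5-XXL text encoder at ~4.76B parameters in fp16), not measured values:

```python
# Rough VRAM estimate: Q6-quantized diffusion model + fp16 T5-XXL text encoder.
# All parameter counts and bits-per-weight are assumptions for illustration.

GiB = 1024 ** 3

flux_params = 12e9          # assumed ~12B parameters for the diffusion model
q6k_bits_per_weight = 6.56  # approximate effective size of GGUF Q6_K

t5xxl_params = 4.76e9       # assumed ~4.76B parameters for T5-XXL
fp16_bits_per_weight = 16

model_gib = flux_params * q6k_bits_per_weight / 8 / GiB   # ~9.2 GiB
clip_gib = t5xxl_params * fp16_bits_per_weight / 8 / GiB  # ~8.9 GiB

print(f"diffusion model: {model_gib:.1f} GiB")
print(f"text encoder:    {clip_gib:.1f} GiB")
print(f"total weights:   {model_gib + clip_gib:.1f} GiB")
```

Under these assumptions the two sets of weights together come to roughly 18 GiB, well under the card's 24 GB even with latents and activations on top, which is why I'd expect them to fit unless the text encoder is never offloaded.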
System
- Ryzen 7 7700X
- RX 7900 XTX
- 32 GB DDR5-6000
- Windows 11 / WSL2
Logs
MIOPEN_FIND_MODE=2 FLASH_ATTENTION_TRITON_AMD_ENABLE="TRUE" python main.py --use-flash-attention
Other
No response