VRAM not being offloaded on AMD with WSL
Your question
I have a 24 GB AMD card officially supported by AMD (7900 XTX). When trying to load bigger models like Flux or HiDream in Q6, they should fit into 24 GB, but they don't: the CLIP gets fully loaded and the diffusion model only partially.
When observing VRAM usage with AMD Adrenalin, I don't see a reduction in VRAM once the CLIP has finished, which leads me to believe that ComfyUI doesn't offload the CLIP to system RAM like it should and instead stacks the diffusion model on top of the already full VRAM, resulting in a partially loaded model.
Furthermore, when running the CLIPs on the CPU, the Q6 diffusion model fits easily in only 16 GB of VRAM. This adds to my suspicion that something is wrong with the offload, because VRAM usage should be the same with the CLIPs on GPU or CPU if ComfyUI performs a correct offload to system RAM.
(Correct me if I misunderstood something.)
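For reference, a rough back-of-the-envelope estimate of the expected weight sizes. The parameter counts and bits-per-weight below are assumptions for illustration (a Flux-class model at ~12B parameters, GGUF Q6_K at roughly 6.56 effective bits per weight, and a T5-XXL text encoder at ~4.76B parameters in fp16), not measured values:

```python
# Rough VRAM estimate: Q6-quantized diffusion model + fp16 T5-XXL text encoder.
# All parameter counts and bits-per-weight are assumptions for illustration.

GiB = 1024 ** 3

flux_params = 12e9          # assumed ~12B parameters for the diffusion model
q6k_bits_per_weight = 6.56  # approximate effective size of GGUF Q6_K

t5xxl_params = 4.76e9       # assumed ~4.76B parameters for T5-XXL
fp16_bits_per_weight = 16

model_gib = flux_params * q6k_bits_per_weight / 8 / GiB   # ~9.2 GiB
clip_gib = t5xxl_params * fp16_bits_per_weight / 8 / GiB  # ~8.9 GiB

print(f"diffusion model: {model_gib:.1f} GiB")
print(f"text encoder:    {clip_gib:.1f} GiB")
print(f"total weights:   {model_gib + clip_gib:.1f} GiB")
```

Under these assumptions the two sets of weights together come to roughly 18 GiB, well under the card's 24 GB even with latents and activations on top, which is why I'd expect them to fit unless the text encoder is never offloaded.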
System
- Ryzen 7 7700X
- RX 7900 XTX
- 32 GB DDR5-6000
- Windows 11 / WSL2
Logs
MIOPEN_FIND_MODE=2 FLASH_ATTENTION_TRITON_AMD_ENABLE="TRUE" python main.py --use-flash-attention
Other
No response