Please fix AMD GPU mem allocation issue.
There seems to be a memory loop issue causing the application to crash when trying to render images. This is a major issue, and I would like to know whether it is going to be addressed. Just curious where it is on the kanban board :D
+1
+1
+1
+1
+1
+1
+1
politely +1
I appreciate being able to use your software, and I would be happy to provide any logs or exceptions needed to help the devs on this project.
I have 32GB of memory, an 8GB Radeon 6650, and an AMD 7950. I have tried switches such as --lowvram, which yields an exception stating the build was not compiled for CUDA; and I have tried some of the other suggested fixes, which all appear to result in the system first allocating 100% of available GPU memory and then not using it, crashing when it needed roughly 65MB of GPU memory =(
Let me know if I can provide any other details.
+1, highly appreciate your work.
A week ago I installed Fooocus on Manjaro Linux on a laptop (AMD Ryzen 6900HS, 32GB RAM, AMD 6800S 8GB VRAM). Everything runs almost without HIP errors using any SDXL model I have tested, with any option and with up to 4 LoRAs in advanced mode. So far, I have only been able to get a memory error with an "Upscale (2x)".
I installed Fooocus by cloning the GitHub repo:
git clone https://github.com/lllyasviel/Fooocus.git
Created a Python environment:
python -m venv venv
source venv/bin/activate
Upgraded pip to the latest version (probably not necessary):
pip install --upgrade pip
Installed the PyTorch nightly with ROCm 5.7 (see the "Install PyTorch" paragraph on https://pytorch.org/):
pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm5.7
Installed the requirements:
pip install -r requirements_versions.txt
Created a file "webui.sh" with the content below:
#!/bin/sh
source venv/bin/activate
HSA_OVERRIDE_GFX_VERSION=10.3.0 python entry_with_update.py --preset realistic
Made it executable and ran it:
chmod +x webui.sh
./webui.sh
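(Not part of the original steps, but a quick sanity check I use, from inside the venv, to confirm the ROCm build of PyTorch actually sees the GPU:)
source venv/bin/activate
# Prints the GPU name if the ROCm build is working, otherwise "False".
HSA_OVERRIDE_GFX_VERSION=10.3.0 python -c "import torch; print(torch.cuda.is_available() and torch.cuda.get_device_name(0))"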
The HSA_OVERRIDE_GFX_VERSION variable seems to be the most important configuration option. If I remember correctly, 10.3.0 should work with RDNA2 cards and 11.0.0 with RDNA3 cards.
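For what it's worth, here is a rough, untested sketch of picking the override automatically from the gfx target that ROCm reports (it assumes rocminfo is installed and only covers the two cases mentioned above):
# Detect the gfx target and set the matching override before launching Fooocus.
GFX=$(rocminfo | grep -om1 'gfx[0-9a-f]*')
case "$GFX" in
  gfx103*) export HSA_OVERRIDE_GFX_VERSION=10.3.0 ;;  # RDNA2 (RX 6000 series)
  gfx11*)  export HSA_OVERRIDE_GFX_VERSION=11.0.0 ;;  # RDNA3 (RX 7000 series)
esac
python entry_with_update.py --preset realistic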
However, yesterday, after upgrading Fooocus:
git pull
I had this error:
ERROR: Cannot install -r requirements_versions.txt (line 1), -r requirements_versions.txt (line 12), -r requirements_versions.txt (line 14), -r
requirements_versions.txt (line 16), -r requirements_versions.txt (line 18), -r requirements_versions.txt (line 3), -r requirements_versions.txt (line 5),
-r requirements_versions.txt (line 8) and numpy==1.23.5 because these package versions have conflicting dependencies.
The conflict is caused by:
The user requested numpy==1.23.5
torchsde 0.2.5 depends on numpy>=1.19.*; python_version >= "3.7"
transformers 4.30.2 depends on numpy>=1.17
accelerate 0.21.0 depends on numpy>=1.17
scipy 1.9.3 depends on numpy<1.26.0 and >=1.18.5
pytorch-lightning 1.9.4 depends on numpy>=1.17.2
gradio 3.41.2 depends on numpy~=1.0
opencv-contrib-python 4.8.0.74 depends on numpy>=1.21.2; python_version >= "3.10"
opencv-contrib-python 4.8.0.74 depends on numpy>=1.23.5; python_version >= "3.11"
opencv-contrib-python 4.8.0.74 depends on numpy>=1.17.0; python_version >= "3.7"
opencv-contrib-python 4.8.0.74 depends on numpy>=1.17.3; python_version >= "3.8"
opencv-contrib-python 4.8.0.74 depends on numpy>=1.19.3; python_version >= "3.9"
onnxruntime 1.16.3 depends on numpy>=1.24.2
To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict
ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts
Replacing numpy==1.23.5 with numpy==1.24.2 in requirements_versions.txt and installing it fixes the problem and everything runs fine again, but I am not sure this is the right way to do it.
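For reference, this is roughly how I applied the workaround (just a sketch; editing requirements_versions.txt by hand works just as well):
# Swap the numpy pin in place, then reinstall the requirements.
sed -i 's/numpy==1.23.5/numpy==1.24.2/' requirements_versions.txt
pip install -r requirements_versions.txt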
"onnxruntime 1.16.3 depends on numpy>=1.24.2" means you are using python 3.11 3.10 will not have this problem
Yes, you're right, I am using Python 3.11. However, when I installed Fooocus a week ago I didn't get any error; I may just have been lucky.
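If recreating the environment is acceptable, using Python 3.10 for the venv should sidestep the conflict instead of patching the pin (a sketch, assuming python3.10 is installed):
# Rebuild the venv with Python 3.10 so the onnxruntime numpy constraint does not apply.
rm -rf venv
python3.10 -m venv venv
source venv/bin/activate
# Reinstall the ROCm nightly of PyTorch as in the steps above, then:
pip install -r requirements_versions.txt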
I've finally got it to render with version 2.1.860 using my 6700 XT (12GB VRAM); however, the VRAM is still detected as only 1024MB, resulting in very, very slow renders. Task Manager and the AMD overlay show full 12GB GPU utilisation though...
My run.bat looks like this, which includes the --attention-split flag as suggested by the script when run.bat runs. Any ideas?
.\python_embeded\python.exe -m pip uninstall torch torchvision torchaudio torchtext functorch xformers -y
.\python_embeded\python.exe -m pip install torch-directml
.\python_embeded\python.exe -s Fooocus\entry_with_update.py --directml --preset realistic --attention-split
pause
I am experiencing the exact same issue. My GPU is an AMD 7800 XT. I have done everything stated in the quoted post. Not sure if I am missing something entirely or just doing something wrong. Any insight would be greatly appreciated. Granted, this does not stop the program from running; it is just slower than expected.
@xjbar currently doing issue cleanup. Is this issue still present for you using the latest version of Fooocus or can it be closed?
@mashb1t - my issue and @oXb3's are still present in the latest version (2.1.865), FYI.
Running here without any problem: https://gist.github.com/hqnicolas/5fbb9c37dcfc29c9a0ffe50fbcb35bdd. For RX 6000 cards use: HSA_OVERRIDE_GFX_VERSION=10.3.0
@hqnicolas does everything on that URL go into run.bat?
@magicAUS you need to do a clean install of Ubuntu 22.04 and copy and paste every step manually into the terminal. First, read the blue titles: 1 - Driver install, 2 - Before Run, 3 - Run it.
@hqnicolas I think the OS is what makes the difference here. @oXb3 and I are on Windows (11 Pro for me).
@magicAUS add an extra SSD to your machine and build it there.
I am facing the exact same issue on Windows. I am running an RX 7800 XT with 32GB of RAM. Fooocus only recognizes 1024MB of VRAM, and when it starts to generate it throws the following:
Fooocus\modules\anisotropic.py:132: UserWarning: The operator 'aten::std_mean.correction' is not currently supported on the DML backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at C:__w\1\s\pytorch-directml-plugin\torch_directml\csrc\dml\dml_cpu_fallback.cpp:17.)
s, m = torch.std_mean(g, dim=(1, 2, 3), keepdim=True)
3%|██▊ | 1/30 [00:07<03:23, 7.01s/it]
[W dml_heap_allocator.cc:120] DML allocator out of memory!
Any solution to it? I've been searching for one but no success so far.
Thank you!
Fooocus really was cool. Please understand I'm not insulting their work.
I have an RX 6650, and I found that SD.Next with the zluda pipeline works best for me.
Most of the memory issues seem to be a Microsoft and AMD bug, but somehow the ZLUDA path makes it work pretty well.
I was getting 15 minutes per image on CPU, and now I get 1-2 minutes per image running on the GPU.
I hope that helps a little.