InvokeAI
[bug]: On linux, defaulting to CPU, can't figure out how to make it use GPU
Is there an existing issue for this?
- [X] I have searched the existing issues
OS
Linux
GPU
cuda
VRAM
8GB
What happened?
Similar to #1763, but on Linux, not Windows.
I'm using InvokeAI on Ubuntu 20.04, installed according to these instructions. Going pretty well, but it's slow, only running on the CPU: `>> Using device_type cpu`. But I should be able to run it on my GPU -- a GTX 1070.
I've tried starting the script with `invoke.py --precision=float32` as suggested in the readme file for 10xx series GPUs, but it's still not working.
Does anybody know what I should do in order to get it using the GPU instead?
Screenshots
```
>> GFPGAN Initialized
>> CodeFormer Initialized
>> ESRGAN Initialized
>> Using device_type cpu
>> Loading stable-diffusion-1.4 from models/ldm/stable-diffusion-v1/model.ckpt
   | LatentDiffusion: Running in eps-prediction mode
   | DiffusionWrapper has 859.52 M params.
   | Making attention of type 'vanilla' with 512 in_channels
   | Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
   | Making attention of type 'vanilla' with 512 in_channels
   | Using more accurate float32 precision
>> Model loaded in 24.44s
>> Setting Sampler to k_lms
```
Additional context
No response
Contact Details
I was having this issue with my 2060 and updating the drivers to the most recent proprietary driver helped
I always have a lot of trouble getting my host system running against the GPU. I've found docker is a much more consistent experience.
You could do that using these instructions:
https://github.com/invoke-ai/InvokeAI/blob/main/docs/installation/040_INSTALL_DOCKER.md#running-the-container-on-your-gpu
Basically, you just install docker then run:
./docker-build/build.sh
./docker-build/run.sh
I'm having this same issue with the `Using device_type cpu`, as well as an error `hipErrorInvalidDevice`, both of which I haven't had much luck diagnosing. Hope this gets resolved soon.
Yeah a cpu with integrated graphics will trigger it as well so it's extra problematic for me...
I have gotten stable-diffusion to work, just not InvokeAI since the last update. So I'm just going to keep trying till I fix it or it gets fixed.
@MordesMortes Yup, I have integrated graphics too and I cannot for the life of me figure out how to make it run with Nvidia. If you figure anything out, please let me know! I'll do the same if I somehow stumble upon a solution :+1:
> Yeah a cpu with integrated graphics will trigger it as well so it's extra problematic for me...
In case it makes a difference, I'm not using integrated graphics. My CPU is an AMD Threadripper.
> updating the drivers to the most recent proprietary driver helped
Heh ... last time I tried to update GPU drivers, I bricked my system and had to reinstall.
If that's what I need to do to fix this, then I'll just let it run on CPU for now. Planning on doing a fresh install as soon as Ubuntu has an LTS with the 6.0 kernel, and I'll get the latest GPU drivers then.
> I've found docker is a much more consistent experience.
Thanks -- I'll give it a try and report back whether it helped!
Those experiencing this problem,
- did you use the automated installer, or did you install manually?
- does your machine have integrated graphics (even if you're not using it), and if so, is that integrated GPU an AMD or another vendor?
Thanks!
@MordesMortes I see you also opened https://github.com/invoke-ai/InvokeAI/issues/2022 the other day. Wondering if that might be related.
A hypothesis: if your machine has an integrated AMD GPU, the installer detects that, and installs a ROCm version of PyTorch. Then when pytorch loads, it falls back on CPU.
But that's only a hypothesis until we can test it.
> did you use the automated installer, or installed manually?
Used these instructions: https://code.mendhak.com/run-stable-diffusion-on-ubuntu/ which I think is a manual installation?
> does your machine have integrated graphics or no (even though you're not using that), and if that's a yes, is that integrated GPU an AMD or other?
Here's a screenshot of a full system summary (`neofetch` & `nvidia-smi`). The nvidia 1070 is the primary GPU and drives all 6 displays -- the two AMD GPUs are only used for video output connectivity.
> Used these instructions: https://code.mendhak.com/run-stable-diffusion-on-ubuntu/ which I think is a manual installation?
Those are very old instructions - we've deprecated `conda` support. If you want to try a reinstall, I'd suggest https://invoke-ai.github.io/InvokeAI/installation/020_INSTALL_MANUAL/#pip-install (since it sounds like you know your way around your system). A fully automated install is also available; it just makes some assumptions/decisions for you. But I'm not saying you have to reinstall - just FYI.
> The nvidia 1070 is the primary GPU and drives all 6 displays -- the two AMD GPUs are only used for video output connectivity.
I don't know if pytorch can automatically tell which one is the primary and which one isn't. There are a couple of things to check after activating your virtual env:
- do you have the CUDA version of pytorch? `pip freeze | grep torch` should show you the version; you're looking for a `cu116` in your `torch` version spec
- can you force Torch to use a specific GPU? pytorch itself has that ability, but I don't think we expose that in Invoke yet. What you could try is: run `python` and try these commands:
```python
import torch
torch.cuda.is_available()
torch.cuda.device_count()
torch.cuda.get_device_capability()
```
and see if pytorch can even find your Nvidia GPU at all.
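As an aside on that first check: the `pip freeze` output can be interpreted mechanically from the wheel's local-version suffix. A minimal sketch (the function name is mine; suffix strings like `cu116` vary by torch release, and on Linux the plain unsuffixed PyPI wheel does bundle CUDA):

```python
def torch_backend(version_spec: str) -> str:
    """Classify a torch version string from `pip freeze` by its
    local-version suffix, e.g. 'torch==1.12.1+cu116' -> 'cuda'."""
    # Strip the package name if the whole freeze line was passed in.
    spec = version_spec.split("==")[-1]
    if "+" not in spec:
        # No suffix: the default PyPI Linux wheel, which ships with CUDA.
        return "unsuffixed"
    suffix = spec.split("+", 1)[1]
    if suffix.startswith("cu"):
        return "cuda"
    if suffix.startswith("rocm"):
        return "rocm"
    if suffix.startswith("cpu"):
        return "cpu-only"
    return "unknown"
```

A `cpu-only` or unexpected `rocm` result here would line up with the hypothesis above about the installer picking the wrong wheel.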
I do have integrated AMD graphics, which it does try to fall back on, and I also have an RX 590, which ROCm detects and tries to run, but because they dropped support it crashes... so I'm back to using automatic1111 for now.
I'm going to pass the AMD GPU to a kqemu environment, which should make it invisible to the OS for the purposes of the installer. If I get CPU processing from that change, it will confirm your suspicion; if I get CUDA with my 2060, then they are two separate issues...
I should have that checked out in a day or two
> I'm going to pass the amd gpu to a kqemu environment which should make it invisible to the os for the point of the installer
Genius! :100:
very curious to see the results
> But I'm not saying you have to reinstall - just FYI.
I don't mind trying to reinstall. I'll give that a try.
Well, I've tried all kinds of ways to install outside of conda, and it's just refusing to work. See #2094
Within conda....
> can you force Torch to use a specific GPU? pytorch itself has that ability, but I don't think we expose that in Invoke yet. What you could try is: run `python`, and try these commands: `import torch`, `torch.cuda.is_available()`, `torch.cuda.device_count()`, `torch.cuda.get_device_capability()`, and see if pytorch can even find your Nvidia GPU at all.
That had an interesting result:
```
(invokeai) o4@TR4:/$ python
Python 3.10.6 (main, Oct 24 2022, 16:07:47) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
False
>>> torch.cuda.device_count()
0
>>> torch.cuda.get_device_capability()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/o4/anaconda3/envs/invokeai/lib/python3.10/site-packages/torch/cuda/__init__.py", line 357, in get_device_capability
    prop = get_device_properties(device)
  File "/home/o4/anaconda3/envs/invokeai/lib/python3.10/site-packages/torch/cuda/__init__.py", line 371, in get_device_properties
    _lazy_init()  # will define _get_device_properties
  File "/home/o4/anaconda3/envs/invokeai/lib/python3.10/site-packages/torch/cuda/__init__.py", line 221, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
>>>
```
That last line in particular is interesting as hell: `AssertionError: Torch not compiled with CUDA enabled`
So while within the conda environment, I tried `pip uninstall torch` and then `pip install torch`.
And something very cool happened:
```
>>> import torch
>>> torch.cuda.is_available()
True
>>> torch.cuda.device_count()
1
>>> torch.cuda.get_device_capability()
(6, 1)
>>>
```
Hey, it actually WORKED!
And then I tried running stable-diffusion, and guess what I saw?
>> Using device_type cuda
So there's my solution for this problem at least! I needed to uninstall and then reinstall torch while in the conda environment, and now I'm running on the GPU!
(And now I'm running out of GPU memory. Sigh. Seems like 8GB isn't enough to drive 6 screens and run stable-diffusion. But maybe restarting the PC will fix that. Or maybe I'll manage to use one of those memory optimization hacks out there as a workaround.)
TL;DR: For anyone experiencing this problem, you should try `pip uninstall torch` and then `pip install torch`.
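After the reinstall, the result can be verified without launching InvokeAI at all. A minimal sketch of a safe check (the function name is mine, not part of InvokeAI; it returns False rather than raising when torch is missing or is a CPU-only build):

```python
import importlib.util


def cuda_ready() -> bool:
    """True only if torch is importable in this environment and
    can actually see a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False  # torch not installed in this environment
    import torch
    return bool(torch.cuda.is_available())
```

Run it inside the same activated environment you launch `invoke.py` from; a True here should correspond to `>> Using device_type cuda` at startup.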
You guys can close this issue now if that's enough of a workaround for you.
Might want to at least add something about that in a readme file or something.
If you install Invoke without first installing CUDA support for the Nvidia driver (it needs to be installed separately; it is not automatically installed just by installing the video card driver), torch will be installed without CUDA support. Even installing CUDA support afterwards will not work unless torch is reinstalled.
After installing the CUDA support, the above procedure
> TL;DR: For anyone experiencing this problem, you should try: pip uninstall torch and then pip install torch
should work.
I'm running Pop!_OS 22.04 (nvidia) and ran into all the issues above. The fix for me was completely removing my invokeai folder, rerunning install.sh, rebooting and then it would use cuda. I'm also running nvidia-driver-525. Hope that helps others.
What about AMD on Arch Linux? It worked on first install, and then didn't work again even after several reinstalls.
```
/home/anonymous/*/invokeai/.venv/lib/python3.10/site-packages/torch/cuda/__init__.py:88: UserWarning: HIP initialization: Unexpected error from hipGetDeviceCount(). Did you run some cuda functions before calling NumHipDevices() that might have already set an error? Error 101: hipErrorInvalidDevice (Triggered internally at ../c10/hip/HIPFunctions.cpp:110.) return torch._C._cuda_getDeviceCount() > 0
```
It gives this error before falling back to using the CPU. I assume it has something to do with how pytorch is installed.
Well, it might also depend on your video card. I have a 590, and AMD dropped compute support for that card, so it worked until I updated, and I had to buy a 2060 to continue. Which is unfortunate, as AMD pulling stuff like that is part of the reason why Nvidia has a lock on the compute market.
So I would say check to see if you still have support.
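One way to do that support check: `rocminfo` prints the GPU's ISA name (e.g. `gfx803` for the RX 590), which can be compared against the supported-GPU list in the ROCm release notes. A rough sketch of extracting it (the sample output is illustrative, not a full `rocminfo` dump):

```python
import re


def gfx_targets(rocminfo_output: str) -> list[str]:
    """Pull the gfx ISA names out of `rocminfo` output so they can be
    compared against the ROCm supported-GPU list."""
    return sorted(set(re.findall(r"gfx[0-9a-f]+", rocminfo_output)))


# Illustrative fragment; real `rocminfo` output is much longer.
sample = """
  Name:                    gfx803
  Marketing Name:          Radeon RX 590 Series
  Name:                    amdgcn-amd-amdhsa--gfx803
"""
```

If the reported gfx target no longer appears in your ROCm version's support list, `hipErrorInvalidDevice` at startup is the expected outcome.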
Got this problem solved by updates in the meantime?
> Got this problem solved by updates in the meantime?
Working okay for me with the workaround described in https://github.com/invoke-ai/InvokeAI/issues/2030#issuecomment-1360844441
So I haven't updated anything since then.
There has been no activity in this issue for 14 days. If this issue is still being experienced, please reply with an updated confirmation that the issue is still being experienced with the latest release.