Solving environment: failed [ResolvePackageNotFound: - cudatoolkit=11.3.1]

Open roimulia2 opened this issue 3 years ago • 8 comments

Hey there!

I'm using macOS Monterey. I installed Anaconda via the installer (https://www.anaconda.com/products/individual) and then ran: conda env create -f environment.yaml

It fails with the error in the title (screenshot: Screen Shot 2022-02-07 at 22 33 50).

Do you have any idea why?

roimulia2 avatar Feb 07 '22 20:02 roimulia2

Hi, I have the same problem.

I tried adding conda-forge to the environment.yaml channels, but still no luck. I ended up un-pinning cudatoolkit:

diff --git a/environment.yaml b/environment.yaml
index f36b0e1..2620d59 100644
--- a/environment.yaml
+++ b/environment.yaml
@@ -1,11 +1,14 @@
 name: ldm
 channels:
   - pytorch
+  - nvidia
+  - conda-forge
+  - anaconda
   - defaults
 dependencies:
   - python=3.8.5
   - pip=20.3
-  - cudatoolkit=11.0
+  - cudatoolkit
   - pytorch=1.7.0
   - torchvision=0.8.1
   - numpy=1.19.2

https://stackoverflow.com/questions/64589421/packagesnotfounderror-cudatoolkit-11-1-0-when-installing-pytorch

david-wolgemuth avatar Apr 20 '22 18:04 david-wolgemuth

I had the same issue with macOS Monterey 12.3.1 on my Intel Mac. I couldn't find a channel that could provide cudatoolkit 11.0, so I let conda install whatever cudatoolkit it could, which ended up being 9.0. Then when I ran python scripts/txt2img.py --prompt "a virus monster is playing guitar, oil on canvas" --ddim_eta 0.0 --n_samples 4 --n_iter 4 --scale 5.0 --ddim_steps 50 it eventually gave me "AssertionError: Torch not compiled with CUDA enabled".

I used the advice from here and downgraded torchvision to 0.6.0, but that version requires pytorch 1.5.0 instead of the 1.7.0 specified in this project's environment.yaml, and running on torchvision 0.6.0 and pytorch 1.5.0 threw "ModuleNotFoundError: No module named 'torch.optim.swa_utils'". Also, pytorch 1.5.0 conflicts with pytorch-lightning, so there are all kinds of issues here.

I am wondering if this comes from an issue with not finding cudatoolkit 11.0 or something else?

Joshua-Elms avatar Apr 27 '22 01:04 Joshua-Elms

I have the same issue. Is there any clear solution?

changemin avatar May 18 '22 09:05 changemin

I do believe it to be an issue with cudatoolkit < 11.0; eventually I just ran it on Linux instead to avoid the issues with Mac, and that worked. Fair warning though, it is very GPU-intensive, and even a friend of mine with a 2080 couldn't get it to run.

Joshua-Elms avatar May 18 '22 15:05 Joshua-Elms

I finally managed to make it work on macOS. These are the steps:

  1. In environment.yaml, remove or comment out the line with cudatoolkit
  2. In environment.yaml, change the pytorch-lightning version to 1.6.1
  3. In txt2img.py, comment out model.cuda() (line 28)
  4. There are some places in the code where a pytorch module is sent to device 'cuda'. Just change them to 'cpu', e.g. ldm/models/diffusion/ddim.py line 21
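
Steps 1 and 2 above can be scripted. This is only a rough sketch: patch_environment is a made-up helper name, and the sample text is a shortened stand-in for environment.yaml (the pip-style pytorch-lightning==1.4.2 pin in it is an assumption, so check your own file first).

```python
import re


def patch_environment(text: str, lightning_version: str = "1.6.1") -> str:
    """Return environment.yaml text with cudatoolkit dropped and
    pytorch-lightning re-pinned (hypothetical helper, not part of the repo)."""
    out = []
    for line in text.splitlines():
        # Step 1: drop the cudatoolkit dependency line entirely.
        if re.match(r"^\s*-\s*cudatoolkit\b", line):
            continue
        # Step 2: re-pin pytorch-lightning, keeping the original '=' or '=='.
        line = re.sub(
            r"(pytorch-lightning\s*=+\s*)[\w.]+",
            lambda m: m.group(1) + lightning_version,
            line,
        )
        out.append(line)
    return "\n".join(out) + "\n"


# Shortened stand-in for the real file; the lightning pin here is assumed.
sample = """\
name: ldm
dependencies:
  - python=3.8.5
  - cudatoolkit=11.0
  - pytorch=1.7.0
  - pip:
    - pytorch-lightning==1.4.2
"""

patched = patch_environment(sample)
print(patched)
```

To use it on the real file, read environment.yaml, pass the text through patch_environment, and write the result back (keep a backup copy first).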

vladhondru25 avatar Jul 08 '22 15:07 vladhondru25

@vladhondru25 yep, that worked for me. Just to be clear to others: there are many spots where you have to change the device from "cuda" to "cpu", not just the one you mentioned.

It's dog-slow, but it works. Thanks!
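
Since there are many spots to change, a small scanner can help find them. This is a minimal sketch using only the standard library; find_cuda_refs and scan_tree are hypothetical helper names, and the pattern only catches the obvious forms (.cuda( calls and quoted "cuda" strings), so it may miss other spellings.

```python
import re
from pathlib import Path

# Matches `.cuda(` calls and quoted device strings such as "cuda" or 'cuda:0'.
CUDA_PATTERN = re.compile(r"""\.cuda\(|["']cuda(?::\d+)?["']""")


def find_cuda_refs(source: str):
    """Return (line_number, stripped_line) for each line that mentions CUDA."""
    return [
        (lineno, line.strip())
        for lineno, line in enumerate(source.splitlines(), start=1)
        if CUDA_PATTERN.search(line)
    ]


def scan_tree(root: str):
    """Yield (path, line_number, line) for every .py file under root."""
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in find_cuda_refs(text):
            yield path, lineno, line


# Made-up snippet for illustration; not copied from the repository.
sample = '''\
model = load_model()
model.cuda()
attr = attr.to(torch.device("cuda"))
x = x.to("cpu")
'''

hits = find_cuda_refs(sample)
for lineno, line in hits:
    print(lineno, line)
```

Running scan_tree(".") from the repository root would list each candidate line so you can edit them by hand.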

strawhatguy avatar Sep 08 '22 00:09 strawhatguy

This should be added as a bugfix; someone should submit a pull request for it. I noticed some other projects that use Stable Diffusion do have a CPU flag to tell it to use the CPU instead. It would be a major fix for anyone using older Macs, especially artists who aren't yet successful enough to upgrade to something more CUDA-compatible.

jskye avatar Sep 27 '22 05:09 jskye

Note that it does do this check now on line 242: device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu"). However, the line you mentioned still calls CUDA unconditionally above it (now on line 63): model.cuda()
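
One way the unconditional model.cuda() call could be guarded is to pick the device name once and reuse it. This is only a sketch: pick_device_name is a made-up helper, not something in the repo, and the try/except around the import is there just so the snippet also runs where PyTorch is not installed.

```python
def pick_device_name() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # No PyTorch at all: nothing can run on a GPU, so report CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device_name = pick_device_name()
print(device_name)

# Hypothetical replacement for the unconditional call in txt2img.py:
#   model.cuda()
# would become
#   model = model.to(torch.device(device_name))
```

On a Mac without an NVIDIA GPU this prints cpu, so the model would stay on the CPU instead of raising the "Torch not compiled with CUDA enabled" assertion.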

jskye avatar Sep 27 '22 05:09 jskye