diffusers
Unable to correctly install dependencies for Dreambooth example on GCP user or managed notebooks
Describe the bug
I am trying to run the Dreambooth training example on a GCP Vertex AI Workbench notebook. I have tried both their managed and user-managed notebooks and hit the same issue on each: I cannot get the dependencies to align correctly.
I installed the dependencies as instructed:
!pip install git+https://github.com/huggingface/diffusers
!pip install -U -r diffusers/examples/dreambooth/requirements.txt
However, when I attempt to initialize an Accelerator environment, I get the following error:
!accelerate env
Traceback (most recent call last):
File "/opt/conda/bin/accelerate", line 5, in <module>
from accelerate.commands.accelerate_cli import main
File "/opt/conda/lib/python3.7/site-packages/accelerate/__init__.py", line 7, in <module>
from .accelerator import Accelerator
File "/opt/conda/lib/python3.7/site-packages/accelerate/accelerator.py", line 25, in <module>
import torch
File "/home/jupyter/.local/lib/python3.7/site-packages/torch/__init__.py", line 191, in <module>
_load_global_deps()
File "/home/jupyter/.local/lib/python3.7/site-packages/torch/__init__.py", line 153, in _load_global_deps
ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
File "/opt/conda/lib/python3.7/ctypes/__init__.py", line 364, in __init__
self._handle = _dlopen(self._name, mode)
OSError: /home/jupyter/.local/lib/python3.7/site-packages/torch/lib/../../nvidia/cublas/lib/libcublas.so.11: undefined symbol: cublasLtGetStatusString, version libcublasLt.so.11
My environment looks like the following (from pip freeze):
Python: 3.7
---------------
torch==1.13.0
diffusers @ git+https://github.com/huggingface/diffusers@8171566163f0b197282786bf39de95c130eb5fa0
accelerate==0.14.0
torchvision==0.14.0
transformers>=4.21.0
ftfy==6.1.1
tensorboard==2.11.0
modelcards==0.1.6
This looks like a version compatibility issue between accelerate and PyTorch, but I'm not sure of the best way to resolve it. I tried downgrading PyTorch to 1.9.0, as suggested in this Stack Overflow post, with no luck.
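The traceback above loads accelerate from /opt/conda/lib/python3.7/site-packages but torch from /home/jupyter/.local/lib/python3.7/site-packages, so more than one site-packages directory is in play. A minimal sketch for listing every copy of a package visible on the import path follows. The helper name find_package_copies is hypothetical, and it only checks for package directories, so treat it as a rough diagnostic rather than a complete resolver.

```python
import os
import sys


def find_package_copies(package, search_paths=None):
    """Return every directory on the search path that contains `package`."""
    paths = sys.path if search_paths is None else search_paths
    hits = []
    for entry in paths:
        # An empty sys.path entry means the current working directory.
        candidate = os.path.join(entry or os.getcwd(), package)
        if os.path.isdir(candidate) and candidate not in hits:
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    for name in ("torch", "accelerate", "diffusers"):
        copies = find_package_copies(name)
        print(name, "->", copies if copies else "not found")
        if len(copies) > 1:
            # Earlier sys.path entries normally win at import time.
            print("  multiple copies found; the first is normally the one imported")
```

If torch shows up under both ~/.local and /opt/conda, a user-level install may be shadowing the one that came with the notebook image.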
Reproduction
No response
System Info
!diffusers-cli env
Traceback (most recent call last):
File "/opt/conda/bin/diffusers-cli", line 5, in <module>
from diffusers.commands.diffusers_cli import main
File "/opt/conda/lib/python3.7/site-packages/diffusers/__init__.py", line 1, in <module>
from .utils import (
File "/opt/conda/lib/python3.7/site-packages/diffusers/utils/__init__.py", line 44, in <module>
from .testing_utils import (
File "/opt/conda/lib/python3.7/site-packages/diffusers/utils/testing_utils.py", line 27, in <module>
import torch
File "/home/jupyter/.local/lib/python3.7/site-packages/torch/__init__.py", line 191, in <module>
_load_global_deps()
File "/home/jupyter/.local/lib/python3.7/site-packages/torch/__init__.py", line 153, in _load_global_deps
ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
File "/opt/conda/lib/python3.7/ctypes/__init__.py", line 364, in __init__
self._handle = _dlopen(self._name, mode)
OSError: /home/jupyter/.local/lib/python3.7/site-packages/torch/lib/../../nvidia/cublas/lib/libcublas.so.11: undefined symbol: cublasLtGetStatusString, version libcublasLt.so.11
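The OSError says that libcublas.so.11 could not resolve a symbol from libcublasLt.so.11. One possible explanation (an assumption, not something confirmed in this thread) is that the dynamic loader is picking up a different libcublasLt than the one shipped next to the pip-installed libcublas. The sketch below lists any libcublasLt files in the directories named by LD_LIBRARY_PATH, so they can be compared with the copy under site-packages/nvidia/cublas/lib. The helper name find_shared_libs is hypothetical.

```python
import glob
import os


def find_shared_libs(pattern, dirs):
    """Return files matching `pattern` in each existing directory of `dirs`, in order."""
    found = []
    for d in dirs:
        if d and os.path.isdir(d):
            # glob.escape guards against special characters in the directory name.
            found.extend(sorted(glob.glob(os.path.join(glob.escape(d), pattern))))
    return found


if __name__ == "__main__":
    # Directories in LD_LIBRARY_PATH are consulted by the dynamic loader
    # in addition to the default system locations.
    ld_dirs = os.environ.get("LD_LIBRARY_PATH", "").split(os.pathsep)
    matches = find_shared_libs("libcublasLt.so*", ld_dirs)
    for path in matches:
        print(path)
    if not matches:
        print("no libcublasLt found via LD_LIBRARY_PATH")
```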
Hey @DevonPeroutky, there seems to be a problem with your torch install.
Can you try just importing PyTorch?
import torch
print(torch.__version__)
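In a notebook it can be convenient to run that import in a fresh interpreter, so that a failed load does not leave the kernel in a half-imported state. A minimal sketch, assuming Python 3.7 or later (for capture_output); check_import is a hypothetical helper name:

```python
import subprocess
import sys


def check_import(module, python=sys.executable):
    """Try `import module` in a fresh interpreter; return (ok, last_error_line)."""
    result = subprocess.run(
        [python, "-c", f"import {module}"],
        capture_output=True,
        text=True,
    )
    lines = result.stderr.strip().splitlines()
    # The last stderr line of a traceback carries the exception type and message.
    return result.returncode == 0, (lines[-1] if lines else "")


if __name__ == "__main__":
    ok, err = check_import("torch")
    print("torch imports cleanly" if ok else f"torch import failed: {err}")
```

If the bare import fails with the same OSError, the problem lies in the torch install itself rather than in accelerate or diffusers.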
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
I seem to have a similar error; see my issue: https://github.com/huggingface/diffusers/issues/1750
I hit the same error. Before I installed xformers, my code ran stably, but after I installed xformers the error appeared. I was just following this guide to speed things up: https://huggingface.co/docs/diffusers/optimization/fp16#memory-efficient-attention
Same error on GCP with PyTorch 1.13.1.