vertex-ai-samples

PyTorch version sync with CUDA

Open nadavw opened this issue 1 year ago • 5 comments

  1. The 'accelerate' lib is missing from the requirements.txt
  2. The PyTorch version isn't synced with the CUDA version, so I had to reinstall it. The default build was cu121, which failed (RuntimeError: The NVIDIA driver on your system is too old) in this cell:
model_path = "model_artifacts"

pipe = StableDiffusionPipeline.from_pretrained(
    model_path, torch_dtype=torch.float16
).to("cuda")

g_cuda = None

import torch
print("PyTorch Version:", torch.__version__)

PyTorch Version: 2.1.2+cu121

After running the following commands and restarting the kernel, the cell completed:

!pip uninstall torch -y
!pip cache purge
!pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

Kernel restart

import torch
print("PyTorch Version:", torch.__version__)

PyTorch Version: 2.1.2+cu118
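
For anyone checking their own environment, here is a minimal sketch of reading the CUDA build tag out of the version string and forming the matching wheel index URL. The helper names split_torch_version and wheel_index_url are hypothetical, not part of PyTorch; the URL pattern is the one from the pip command above.

```python
from typing import Optional, Tuple


def split_torch_version(version: str) -> Tuple[str, Optional[str]]:
    """Split a string like '2.1.2+cu121' into ('2.1.2', 'cu121').

    The build tag is None when the version has no '+' suffix.
    """
    base, sep, tag = version.partition("+")
    return base, (tag if sep else None)


def wheel_index_url(cuda_tag: str) -> str:
    """Build the index URL for a CUDA tag, following the pattern used above."""
    return f"https://download.pytorch.org/whl/{cuda_tag}"


if __name__ == "__main__":
    # The version string reported before the reinstall.
    base, tag = split_torch_version("2.1.2+cu121")
    print("PyTorch:", base, "CUDA build:", tag)
    # The index that worked with the older driver in this report.
    print(wheel_index_url("cu118"))
```

In the notebook you would pass torch.__version__ to split_torch_version. Which tag your driver supports depends on the machine, so treat cu118 as what worked here rather than a general answer.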

nadavw avatar Jan 28 '24 08:01 nadavw

@nadavw thank you for the information. Can you please let me know which notebook you referred to? Thanks.

gericdong avatar Feb 06 '24 22:02 gericdong

notebooks/community/vertex_endpoints/torchserve/dreambooth_stablediffusion.ipynb

nadavw avatar Feb 07 '24 07:02 nadavw

@telpirion can you please help take a look at this? Thanks.

gericdong avatar Feb 07 '24 15:02 gericdong

@gericdong I can't help, I'm afraid. Sorry. (This is also a community notebook, so the threshold for fixes is higher.)

It looks like there's a missing dependency (accelerate) and torch needs to be installed with a CUDA build that matches the driver (cu118 rather than cu121 in this report). Should be a quick fix. @nadavw has provided a lot of the required changes in their detailed write-up.
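
A quick way to confirm the missing dependency before running the notebook might look like the sketch below. The missing_packages helper and the REQUIRED list are assumptions for illustration; the notebook's actual requirements.txt should be the source of truth.

```python
import importlib.util
from typing import Iterable, List


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical list: torch and accelerate are named in this thread, and
# diffusers is assumed because it provides StableDiffusionPipeline.
REQUIRED = ["torch", "diffusers", "accelerate"]

if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```

This only checks that a package can be located, not that its version is compatible.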

telpirion avatar Feb 07 '24 16:02 telpirion

@katiemn can you help take a look at this? Thanks.

gericdong avatar Feb 08 '24 15:02 gericdong