model-angelo Issue with RTX3080

The default PyTorch build isn't compatible with an RTX 3080 GPU, so I edited the install.sh script.

Line 45 for my system should have read:

conda install pytorch torchvision torchaudio pytorch-cuda=11.7 cudatoolkit=11.7 -c pytorch-nightly -c nvidia

After this change it works fine. Before, the installed version of PyTorch wasn't compatible with the GPU.

Just a heads up for anyone else with a 3000 series card.

Oct 17 '22 04:10 mbelouso

Hi,

That's interesting. What error message were you getting? Normally CUDA major releases should be interoperable.

Best, Kiarash.

Oct 17 '22 07:10 jamaliki

The error I got was the following

NVIDIA GeForce RTX 3080 Ti with CUDA capability sm_86 is not compatible with the current PyTorch installation. The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70. If you want to use the NVIDIA GeForce RTX 3080 Ti GPU with PyTorch, please check the instructions at Start Locally | PyTorch
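For anyone who wants to reproduce the check behind that message, here is a minimal sketch. The helper name arch_supported is hypothetical, and the rule it applies (a build counts as compatible if it was compiled for any architecture with the same major version as the card) is my assumption about how the comparison works, not something taken from the model-angelo scripts.

```shell
# Return 0 if capability $1 (e.g. "sm_86") shares a major version with any
# sm_* entry in the space-separated architecture list $2, otherwise return 1.
# Assumption: same major version is enough for a build to run on the card.
arch_supported() {
    cap=${1#sm_}
    cap=${cap%%[!0-9]*}          # drop any non-numeric suffix
    cap_major=$(( cap / 10 ))
    for arch in $2; do
        case "$arch" in
            sm_*)
                num=${arch#sm_}
                num=${num%%[!0-9]*}
                [ $(( num / 10 )) -eq "$cap_major" ] && return 0
                ;;
        esac
    done
    return 1
}

# Values copied from the error message above.
if arch_supported sm_86 "sm_37 sm_50 sm_60 sm_70"; then
    echo "sm_86: supported"
else
    echo "sm_86: not supported by this build"
fi

# To query a live install instead (requires a working PyTorch):
#   python -c 'import torch; print(" ".join(torch.cuda.get_arch_list()))'
#   python -c 'import torch; print("sm_%d%d" % torch.cuda.get_device_capability(0))'
```

With the list from the error message, no entry has major version 8, so the card is reported as unsupported.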

Oct 17 '22 09:10 mbelouso

I had exactly the same issue with my A100 cards. The above command fixed it. Thanks @mbelouso.

Also, model-angelo first installed without any error, but when I ran the job it was really slow. It turned out that the CPU version of PyTorch had been installed because the CUDA environment variables were missing (CUDA_HOME, CUDA_LIB and PATH to bin ...). To check whether you have the CUDA build or the much slower CPU build of PyTorch, run this:

conda list | grep "^pytorch " | grep -E 'cuda|cpu'
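If you want the same check in a script, something like the sketch below might work. It assumes the build string is the third column of the conda list output and contains either "cuda" or "cpu". The function name pytorch_build_kind is made up, and the sample rows are illustrative rather than copied from a real environment.

```shell
# Read `conda list` output on stdin and print "cuda", "cpu", or "unknown"
# based on the build string (third column) of the pytorch package.
pytorch_build_kind() {
    awk '$1 == "pytorch" {
             if ($3 ~ /cuda/)     kind = "cuda"
             else if ($3 ~ /cpu/) kind = "cpu"
         }
         END { print (kind == "" ? "unknown" : kind) }'
}

# Illustrative rows in the layout `conda list` prints (name, version, build, channel).
printf 'pytorch  1.13.0  py3.9_cuda11.7_cudnn8.5.0_0  pytorch\n' | pytorch_build_kind
printf 'pytorch  1.13.0  py3.9_cpu_0  pytorch\n' | pytorch_build_kind

# On a real environment:
#   conda list | pytorch_build_kind
```

Unlike the plain grep, this prints "unknown" when no pytorch row is found, so a script can tell a missing package apart from a CPU build.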

Best, Jesper

Nov 10 '22 09:11 jelka71

Dear @mbelouso and @jelka71

I would like to thank @mbelouso for first noticing the issue and offering a fix. I have now pinned this issue. I am thinking about the best way to automate handling of this CUDA mismatch in the installation script. If any of you have ideas, please let me know.
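One possible direction, as a rough sketch only: pick the pytorch-cuda version from the GPU's compute capability. The function pick_pytorch_cuda is hypothetical and not part of install.sh. It assumes that capability 8.x cards need a CUDA 11 build, and it uses 11.7 only because that is the version reported to work in this thread. Anything else falls through to the existing default.

```shell
# Print the pytorch-cuda version to request for compute capability $1
# (e.g. "8.6"), or "default" to leave the existing install line unchanged.
# Assumption: 8.x cards need a CUDA 11 build; 11.7 is the version from this thread.
pick_pytorch_cuda() {
    major=${1%%.*}
    if [ "$major" -eq 8 ]; then
        echo "11.7"
    else
        echo "default"
    fi
}

pick_pytorch_cuda 8.6   # RTX 3080 / 3080 Ti
pick_pytorch_cuda 7.0   # e.g. V100

# In an install script the capability could come from the driver, e.g.
#   cap=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)
# (assumes a driver recent enough to support the compute_cap query field).
```

The two sample calls print 11.7 and default. The script would then add pytorch-cuda=11.7 cudatoolkit=11.7 to the conda install line only in the first case.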

Best, Kiarash.

Nov 10 '22 12:11 jamaliki