How to install mish-cuda when cuda is 11.1?

Open yunxi1 opened this issue 4 years ago • 7 comments

My GPU is an RTX 3090, so I have to use CUDA 11. I already checked my CUDA 11.1 install and it works, but when I run pip install git+https://github.com/thomasbrandon/mish-cuda/ to install mish-cuda, I get an error:

unable to execute ':/usr/local/cuda/bin/nvcc': No such file or directory
error: command ':/usr/local/cuda/bin/nvcc' failed with exit status 1

ERROR: Failed building wheel for mish-cuda

what should I do?

yunxi1 avatar Nov 17 '20 08:11 yunxi1

That colon shouldn't be there. Looks like it's part of the CUDA path detected by torch.utils.cpp_extension (which does the compilation). Check whether the CUDA_HOME environment variable is set and verify its value. Otherwise check what running which nvcc returns. Those are the methods used for detection. If both look fine, you'll have to look at the detection/compilation logic to see what's going wrong. See torch/utils/cpp_extension.py#L27.
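A minimal sketch of those two checks in Python, in case it helps anyone landing here. The function name check_cuda_home is hypothetical and this is not the detection code from torch.utils.cpp_extension; it only reports what CUDA_HOME holds, whether it looks like a single directory, and what a PATH lookup for nvcc finds.

```python
import os
import shutil


def check_cuda_home(env=None):
    """Return human-readable findings about CUDA_HOME and nvcc (hypothetical helper)."""
    env = os.environ if env is None else env
    findings = []
    cuda_home = env.get("CUDA_HOME")
    if cuda_home is None:
        findings.append("CUDA_HOME is not set")
    else:
        findings.append(f"CUDA_HOME={cuda_home!r}")
        # A separator such as ':' means the value is a PATH-style list, not one directory.
        if os.pathsep in cuda_home:
            findings.append("CUDA_HOME contains a path separator; expected a single directory")
        nvcc = os.path.join(cuda_home, "bin", "nvcc")
        if not os.path.isfile(nvcc):
            findings.append(f"no nvcc found at {nvcc!r}")
    # Equivalent of `which nvcc`: look the compiler up on PATH.
    findings.append(f"which nvcc -> {shutil.which('nvcc', path=env.get('PATH'))}")
    return findings


if __name__ == "__main__":
    for line in check_cuda_home():
        print(line)
```

Run it in the same shell you use for pip install, so it sees the same environment as the build.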

thomasbrandon avatar Nov 18 '20 16:11 thomasbrandon

I have been running on cuda11.1 using pytorch1.6 but the compiled file fails with pytorch1.7 which officially supports cuda11. I am getting a function import failure with an unrecognized character from the compiled library so I downgraded back to pytorch 1.6...

This is the error I get:

ImportError: ~/.local/lib/python3.7/site-packages/mish_cuda/_C.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN3c104impl23ExcludeDispatchKeyGuardC1ENS_11DispatchKeyE

rafale77 avatar Nov 20 '20 08:11 rafale77

@rafale77 Did you re-install the extension after upgrading PyTorch? I wouldn't expect binary compatibility across versions, so you need to re-install to re-compile.

thomasbrandon avatar Nov 21 '20 10:11 thomasbrandon

Yes, I did. I tested it both ways, without recompiling and with recompiling on pytorch 1.7, and got the same failure. Compiled by pytorch 1.6 and 1.7, both work fine on pytorch 1.6.

Edit: As I wrote the line above, I started suspecting a user error: the recompile never actually happened because the original binary was not overwritten. I redid it after uninstalling the extension first, and that seems to have fixed the issue.
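For reference, the uninstall-then-rebuild sequence could be scripted roughly like this. It is only a sketch: the helper names are hypothetical, and adding pip's --no-cache-dir flag is my assumption for avoiding a previously built wheel, not something confirmed in this thread.

```python
import subprocess
import sys


def reinstall_commands(package, source):
    """Build the pip commands: uninstall first, then install without the wheel cache."""
    pip = [sys.executable, "-m", "pip"]
    return [
        pip + ["uninstall", "-y", package],
        pip + ["install", "--no-cache-dir", source],
    ]


def reinstall(package, source):
    """Run the commands in order, stopping at the first failure."""
    for cmd in reinstall_commands(package, source):
        subprocess.run(cmd, check=True)
```

Calling reinstall("mish-cuda", "git+https://github.com/thomasbrandon/mish-cuda/") after switching PyTorch versions should force a fresh compile against the installed version.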

rafale77 avatar Nov 21 '20 13:11 rafale77

I changed my CUDA to 11.0 and set the environment variable:

export CUDA_HOME=/usr/local/cuda

Then a new error appeared:

nvcc fatal : Unsupported gpu architecture 'compute_86'
error: command '/usr/local/cuda/bin/nvcc' failed with exit status 1

So I added another environment variable to target a lower architecture:

export TORCH_CUDA_ARCH_LIST="7.5"

With the downgraded architecture, it works!
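The compute_86 error fits the fact that CUDA 11.0 cannot target compute capability 8.6; support for it arrived in CUDA 11.1. A rough sketch of choosing a TORCH_CUDA_ARCH_LIST value the toolkit can build for is below. The table and the pick_arch helper are hypothetical, the values are from the CUDA release notes as I recall them, and the sketch picks the highest architecture the toolkit supports rather than the "7.5" used above. Whether your PyTorch version accepts the resulting value is something to verify.

```python
# Highest compute capability each toolkit release can target.
# Assumption: recalled from the CUDA release notes; check the notes for your toolkit.
MAX_ARCH_BY_TOOLKIT = {
    (10, 0): (7, 5),
    (11, 0): (8, 0),
    (11, 1): (8, 6),
}


def pick_arch(toolkit, device):
    """Return an arch string no higher than both the device and the toolkit limit."""
    if toolkit not in MAX_ARCH_BY_TOOLKIT:
        raise KeyError(f"toolkit {toolkit} is not in the table")
    # Tuples compare element by element, so min() picks the lower capability.
    major, minor = min(device, MAX_ARCH_BY_TOOLKIT[toolkit])
    return f"{major}.{minor}"


print(pick_arch((11, 0), (8, 6)))  # RTX 3090 with CUDA 11.0
print(pick_arch((11, 1), (8, 6)))  # RTX 3090 with CUDA 11.1
```

The first call prints 8.0 and the second prints 8.6; the result would be exported as TORCH_CUDA_ARCH_LIST before running pip install.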

yunxi1 avatar Jan 07 '21 08:01 yunxi1

hi, I tried:

-export CUDA_HOME=$CUDA_HOME:/usr/local/cuda
+export CUDA_HOME=/usr/local/cuda

it works
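This would explain the stray colon in the original error: appending to CUDA_HOME as if it were PATH leaves a leading colon when the variable was empty or unset beforehand. A small Python illustration of the string handling follows. The helper names are hypothetical, and the join only mimics how a build tool might locate nvcc; it is not the actual PyTorch code.

```python
import posixpath


def append_style(previous):
    """Mimic `export CUDA_HOME=$CUDA_HOME:/usr/local/cuda` for a given previous value."""
    return f"{previous}:/usr/local/cuda"


def nvcc_path(cuda_home):
    """Join CUDA_HOME with bin/nvcc, as a build tool might when locating the compiler."""
    return posixpath.join(cuda_home, "bin", "nvcc")


print(nvcc_path(append_style("")))   # → :/usr/local/cuda/bin/nvcc
print(nvcc_path("/usr/local/cuda"))  # → /usr/local/cuda/bin/nvcc
```

The first result matches the path in the reported error, and the second is what the plain assignment gives.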

Sukeysun avatar Oct 26 '21 08:10 Sukeysun

hi, I tried:

-export CUDA_HOME=$CUDA_HOME:/usr/local/cuda
+export CUDA_HOME=/usr/local/cuda

it works

It really works bro!

meanmee avatar Nov 22 '21 02:11 meanmee