
Why does `nvidia-cuda-runtime-cu12` not work, and why must the `/usr/local/cuda` version be 11.6 or above?

lvzii opened this issue 10 months ago · 3 comments

Installation of this package fails with the error message below:

  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [12 lines of output]
      fatal: not a git repository (or any of the parent directories): .git
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-install-316ekinv/flash-attn_8533b39ea95943b09a2457b1e0020eec/setup.py", line 115, in <module>
          raise RuntimeError(
      RuntimeError: FlashAttention is only supported on CUDA 11.6 and above.  Note: make sure nvcc has a supported version by running nvcc -V.
      
      
      torch.__version__  = 2.2.2+cu121
      
      
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

My /usr/local/cuda version is definitely less than 11.6, but from my point of view /usr/local/cuda is a CUDA runtime, and the Python package `nvidia-cuda-runtime-cu12` from NVIDIA is also a CUDA runtime. So why does `nvidia-cuda-runtime-cu12` not satisfy the check, and why must the `/usr/local/cuda` version be 11.6 or above?
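To show the mismatch concretely, here is a small diagnostic sketch (my own, and it assumes nvcc sits at /usr/local/cuda/bin/nvcc): it prints the CUDA runtime bundled with torch next to the version reported by the nvcc that the build actually checks.

```python
# Sketch: compare the CUDA runtime torch ships with against the nvcc compiler
# under /usr/local/cuda (the path is an assumption; adjust to your system).
import subprocess

import torch

print("torch runtime CUDA:", torch.version.cuda)  # e.g. "12.1" for 2.2.2+cu121

try:
    out = subprocess.check_output(["/usr/local/cuda/bin/nvcc", "-V"], text=True)
    release_line = next(line for line in out.splitlines() if "release" in line)
    print("nvcc compiler    :", release_line)  # e.g. "Cuda compilation tools, release 11.4, ..."
except FileNotFoundError:
    print("no nvcc found under /usr/local/cuda")
```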

Similar issues: https://github.com/Dao-AILab/flash-attention/issues/842 https://github.com/Dao-AILab/flash-attention/issues/825 https://github.com/Dao-AILab/flash-attention/issues/557

lvzii · Apr 10 '24 08:04

make sure nvcc has a supported version by running nvcc -V

tridao · Apr 10 '24 08:04

Thank you for your answer, but what I mean is: apart from the installation check for the CUDA version, is nvcc used anywhere else? Packages like torch don't check CUDA runtime versions through nvcc.
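For reference, the check that raises the error appears to look roughly like this (a paraphrase of what the traceback points at in setup.py, not a verbatim copy); it parses `nvcc -V` from under CUDA_HOME, so the pip-installed `nvidia-cuda-runtime-cu12` package is never consulted:

```python
# Rough paraphrase (not verbatim) of the version check the traceback points at:
# it reads the nvcc under CUDA_HOME, not any pip-installed CUDA runtime package.
import subprocess

from packaging.version import Version, parse
from torch.utils.cpp_extension import CUDA_HOME  # typically /usr/local/cuda

raw = subprocess.check_output([CUDA_HOME + "/bin/nvcc", "-V"], universal_newlines=True)
tokens = raw.split()
bare_metal_version = parse(tokens[tokens.index("release") + 1].rstrip(","))

if bare_metal_version < Version("11.6"):
    raise RuntimeError(
        "FlashAttention is only supported on CUDA 11.6 and above.  "
        "Note: make sure nvcc has a supported version by running nvcc -V."
    )
```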

lvzii · Apr 10 '24 09:04

I came across this too because I had an older full toolkit (which included nvcc) installed on my system, while my conda environment had version 12.1 of the CUDA runtime. I ended up removing both CUDA installations completely and installing the full toolkit, which includes nvcc, from https://developer.nvidia.com/cuda-12-1-0-download-archive?target_os=Linux&target_arch=x86_64&Distribution=WSL-Ubuntu&target_version=2.0 .
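After installing the toolkit, a quick sanity check along these lines (just a sketch; the paths are assumptions about a typical setup) confirms the build will see the new nvcc before retrying `pip install flash-attn`:

```python
# Sketch: confirm the toolkit the extension build will use is the new one.
# CUDA_HOME usually resolves to /usr/local/cuda (often a symlink to cuda-12.1).
import subprocess

from torch.utils.cpp_extension import CUDA_HOME

print("CUDA_HOME:", CUDA_HOME)
print(subprocess.check_output([CUDA_HOME + "/bin/nvcc", "-V"], text=True))
# If this reports release 12.1, the install should get past the version check.
```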

nvcc is used to compile when there isn't an available wheel or when the user chooses to build from source. I am not sure if there is another reason for it even when a matching wheel exists; if there isn't, it would make sense to move that check to just before compiling. A lot of people install CUDA when they install PyTorch, and I think when done that way you only get the CUDA runtime, not nvcc.

c1505 · Apr 24 '24 07:04