flash-attention
torch.__version__ = 2.2.0+cu118, but getting the error: FlashAttention is only supported on CUDA 11.6 and above.
I receive the error below:

RuntimeError: FlashAttention is only supported on CUDA 11.6 and above. Note: make sure nvcc has a supported version by running nvcc -V.

But my PyTorch version is 2.2.0+cu118, so why am I getting this error?
Does nvcc -V run?
I am also blocked by this error, even though nvcc -V runs.
My environment has torch.__version__ = 2.3.1+cu121.
My nvcc -V output is below:
```
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
```
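That output is the key: torch.version.cuda only describes the CUDA toolkit your PyTorch wheel was built against, while building flash-attention compiles with the local nvcc, which here is 10.1. A quick way to see both side by side (an illustrative snippet, not part of flash-attention):

```python
# Illustrative check: compare the CUDA toolkit PyTorch was built with
# against the nvcc found on PATH (the one flash-attention's build will use).
import subprocess
import torch

print("torch.__version__  :", torch.__version__)   # e.g. 2.3.1+cu121
print("torch.version.cuda :", torch.version.cuda)  # toolkit the wheel was built with
print(subprocess.run(["nvcc", "-V"], capture_output=True, text=True).stdout)
```

If the last line reports release 10.1 while torch reports 12.1, the build fails with exactly the error quoted above.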
As mentioned in the README, we require CUDA (and nvcc) >= 11.6.
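In other words, the floor is on the nvcc release, not on the CUDA version baked into the PyTorch wheel. A minimal sketch of that kind of gate (illustrative only, not the actual setup.py code), using the release from the output pasted above:

```python
# Sketch of the version gate described above (illustrative, not flash-attention's setup.py).
# "10.1" is the release shown in the nvcc -V output pasted earlier in this thread.
from packaging.version import Version

nvcc_release = Version("10.1")
if nvcc_release < Version("11.6"):
    raise RuntimeError(
        "FlashAttention is only supported on CUDA 11.6 and above. "
        "Note: make sure nvcc has a supported version by running nvcc -V."
    )
```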
There is no rationale for this.
https://github.com/Dao-AILab/flash-attention/blob/3cea2fb6ee54fb7e1aad9db6ac6c9331184b8647/setup.py#L417-L428
Read: it's to save CI time; minor versions should be compatible.
Could you please at least provide an environment variable to override the wheel URL or cuda_version?
e.g. torch_cuda_version = parse(os.getenv('CUDA_VERSION', torch.version.cuda))
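Expanded into a self-contained sketch, that suggestion would look roughly like this; CUDA_VERSION is the hypothetical variable being requested here, not something the current setup.py reads:

```python
# Sketch of the proposed override: let an environment variable take precedence
# over torch.version.cuda when choosing which prebuilt wheel's CUDA version to target.
# CUDA_VERSION is hypothetical -- it is the variable being requested in this thread.
import os
import torch
from packaging.version import parse

torch_cuda_version = parse(os.getenv("CUDA_VERSION", torch.version.cuda))
print("CUDA version used for wheel selection:", torch_cuda_version)
```

A user could then set CUDA_VERSION=12.1 before installing to control the wheel lookup explicitly.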
@lucy66666 To change the nvcc version, you can run: conda install nvidia::cuda-toolkit=12.1