flash-attention
torch.__version__ = 2.2.0+cu118, but getting the error: FlashAttention is only supported on CUDA 11.6 and above.
I receive the error below:

RuntimeError: FlashAttention is only supported on CUDA 11.6 and above. Note: make sure nvcc has a supported version by running nvcc -V.

But my PyTorch version is 2.2.0+cu118, so why am I getting this error?
Does nvcc -V run?
I am also blocked by this error, even though nvcc -V runs.
My environment has torch.__version__ = 2.3.1+cu121.
My nvcc -V output is below:
```
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
```
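That output is the key: torch.version.cuda only describes the CUDA toolkit your PyTorch wheel was built against, while building flash-attention compiles with the local nvcc, which here is 10.1. A quick way to see both side by side (an illustrative snippet, not part of flash-attention):

```python
# Illustrative check: compare the CUDA toolkit PyTorch was built with
# against the nvcc found on PATH (the one flash-attention's build will use).
import subprocess
import torch

print("torch.__version__  :", torch.__version__)   # e.g. 2.3.1+cu121
print("torch.version.cuda :", torch.version.cuda)  # toolkit the wheel was built with
print(subprocess.run(["nvcc", "-V"], capture_output=True, text=True).stdout)
```

If the last line reports release 10.1 while torch reports 12.1, the build fails with exactly the error quoted above.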
As mentioned in the README, we require CUDA (and nvcc) >= 11.6.
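In other words, the floor is on the nvcc release, not on the CUDA version baked into the PyTorch wheel. A minimal sketch of that kind of gate (illustrative only, not the actual setup.py code), using the release from the output pasted above:

```python
# Sketch of the version gate described above (illustrative, not flash-attention's setup.py).
# "10.1" is the release shown in the nvcc -V output pasted earlier in this thread.
from packaging.version import Version

nvcc_release = Version("10.1")
if nvcc_release < Version("11.6"):
    raise RuntimeError(
        "FlashAttention is only supported on CUDA 11.6 and above. "
        "Note: make sure nvcc has a supported version by running nvcc -V."
    )
```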
There is no rationale for this.
https://github.com/Dao-AILab/flash-attention/blob/3cea2fb6ee54fb7e1aad9db6ac6c9331184b8647/setup.py#L417-L428
Read: it's to save CI time; minor versions should be compatible.
Could you please at least provide an environment variable to override the wheel URL or cuda_version?
e.g. torch_cuda_version = parse(os.getenv('CUDA_VERSION', torch.version.cuda))
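Expanded into a self-contained sketch, that suggestion would look roughly like this; CUDA_VERSION is the hypothetical variable being requested here, not something the current setup.py reads:

```python
# Sketch of the proposed override: let an environment variable take precedence
# over torch.version.cuda when choosing which prebuilt wheel's CUDA version to target.
# CUDA_VERSION is hypothetical -- it is the variable being requested in this thread.
import os
import torch
from packaging.version import parse

torch_cuda_version = parse(os.getenv("CUDA_VERSION", torch.version.cuda))
print("CUDA version used for wheel selection:", torch_cuda_version)
```

A user could then set CUDA_VERSION=12.1 before installing to control the wheel lookup explicitly.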
@lucy66666 To change the nvcc version, you can run: conda install nvidia::cuda-toolkit=12.1