
Issue with installing flash-attention: `import flash_attn_2_cuda as flash_attn_cuda`

Open hahmad2008 opened this issue 1 year ago • 11 comments

Gemma2 needs torch>=2.4.0, as mentioned, because when I run it I get this error:

  File "/usr/local/lib/python3.10/dist-packages/transformers/cache_utils.py", line 1656, in __init__
    torch._dynamo.mark_static_address(new_layer_key_cache)
AttributeError: module 'torch._dynamo' has no attribute 'mark_static_address'
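
For reference, a quick check of whether the running torch exposes that attribute (plain hasattr, nothing Gemma-specific):

python -c "import torch, torch._dynamo; print(torch.__version__, hasattr(torch._dynamo, 'mark_static_address'))"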

So this needs torch>=2.4.0, but my current versions are the following:

>>> import torch;torch.__version__
'2.0.1+cu117'
>>> import flash_attn;flash_attn.__version__
'2.5.6'

The problem: I tried to install torch version '2.4.0+cu118', while my CUDA toolkit is

root@0d6c1aeee409:/space/LongLM# nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

I got this error:

>>> import flash_attn;flash_attn.__version__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.10/dist-packages/flash_attn/__init__.py", line 3, in <module>
    from flash_attn.flash_attn_interface import (
  File "/usr/local/lib/python3.10/dist-packages/flash_attn/flash_attn_interface.py", line 10, in <module>
    import flash_attn_2_cuda as flash_attn_cuda
ImportError: /usr/local/lib/python3.10/dist-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops5zeros4callEN3c108ArrayRefINS2_6SymIntEEENS2_8optionalINS2_10ScalarTypeEEENS6_INS2_6LayoutEEENS6_INS2_6DeviceEEENS6_IbEE
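
That undefined symbol is a mangled libtorch function, which usually means the flash-attn extension was built against a different torch than the one installed. As a diagnostic (not a fix), it can be demangled with c++filt:

c++filt _ZN2at4_ops5zeros4callEN3c108ArrayRefINS2_6SymIntEEENS2_8optionalINS2_10ScalarTypeEEENS6_INS2_6LayoutEEENS6_INS2_6DeviceEEENS6_IbEE
# roughly: at::_ops::zeros::call(c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, ...)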

So I uninstalled and reinstalled flash-attention like the following:

pip uninstall flash-attn
pip install --no-build-isolation flash-attn==2.5.6 -U --force-reinstall

However, this uninstalls the current torch and installs torch '2.5.1+cu124', and I still have this issue:

 import flash_attn_2_cuda as flash_attn_cuda
ImportError: /usr/local/lib/python3.10/dist-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops5zeros4callEN3c108ArrayRefINS2_6SymIntEEENS2_8optionalINS2_10ScalarTypeEEENS6_INS2_6LayoutEEENS6_INS2_6DeviceEEENS6_IbEE

So I can't install it!
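
A variant of the reinstall that at least stops pip from replacing the installed torch (assuming the other dependencies are already in place) would be:

pip install flash-attn==2.5.6 --no-build-isolation --no-deps --force-reinstall

Whether the build itself then succeeds still depends on the torch/CUDA combination.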

hahmad2008 avatar Nov 20 '24 17:11 hahmad2008

You should

  • Install torch 2.4 (if that's the version you want)
  • Install flash-attn (latest version 2.7.0.post2 should work); see the commands sketched below
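
A minimal sketch of that order, assuming the cu118 wheels from the official PyTorch index (swap the index URL and versions for your CUDA toolkit):

pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cu118
pip install flash-attn==2.7.0.post2 --no-build-isolation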

tridao avatar Nov 20 '24 18:11 tridao

You should

  • Install torch 2.4 (if that's the version you want)
  • Install flash-attn (latest version 2.7.0.post2 should work)

How do I know which flash-attn version and torch version to install? I have the same problem, it's really annoying.

root@e4b47fc2098b:/workspace/OpenRLHF# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Thu_Jun__6_02:18:23_PDT_2024
Cuda compilation tools, release 12.5, V12.5.82
Build cuda_12.5.r12.5/compiler.34385749_0
root@e4b47fc2098b:/workspace/OpenRLHF# pip show torch
Name: torch
Version: 2.5.1
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: [email protected]
License: BSD-3-Clause
Location: /usr/local/lib/python3.10/dist-packages
Requires: filelock, fsspec, jinja2, networkx, nvidia-cublas-cu12, nvidia-cuda-cupti-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-runtime-cu12, nvidia-cudnn-cu12, nvidia-cufft-cu12, nvidia-curand-cu12, nvidia-cusolver-cu12, nvidia-cusparse-cu12, nvidia-nccl-cu12, nvidia-nvjitlink-cu12, nvidia-nvtx-cu12, sympy, triton, typing-extensions
Required-by: accelerate, bitsandbytes, compressed-tensors, deepspeed, flash-attn, lightning-thunder, openrlhf, optimum, peft, torch-tensorrt, torchmetrics, torchvision, vllm, xformers
root@e4b47fc2098b:/workspace/OpenRLHF# pip show flash_attn
Name: flash-attn
Version: 2.7.0.post2
Summary: Flash Attention: Fast and Memory-Efficient Exact Attention
Home-page: https://github.com/Dao-AILab/flash-attention
Author: Tri Dao
Author-email: [email protected]
License:
Location: /usr/local/lib/python3.10/dist-packages
Requires: einops, torch
Required-by: openrlhf

chuangzhidan avatar Dec 23 '24 16:12 chuangzhidan

Version: 2.7.0.post2

flash_attn-2.7.1.post4+cu12torch2.5cxx11abiTRUE-cp310-cp310-linux_x86_64.whl: this one doesn't work either.
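
When an abiTRUE (or abiFALSE) wheel fails like this, one thing worth checking is whether the installed torch itself was built with the C++11 ABI, since the wheel's abi tag generally has to match it; torch reports this directly:

python -c "import torch; print(torch.compiled_with_cxx11_abi())"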

chuangzhidan avatar Dec 24 '24 02:12 chuangzhidan

I met the same problem. Linux: cuda==12.2; env: torch==1.13.1+cu117, python==3.10, flash_attn==2.5.9.post1+cu118torch1.13cp310. How can I resolve it?

import flash_attn_2_cuda as flash_attn_cuda
ImportError: undefined symbol: _ZN3c105ErrorC2ENS_14SourceLocationENSt7_cxx1112basic_stringIcSt11char_traitsIcESaIcEE

Zhangxx1218 avatar Dec 24 '24 03:12 Zhangxx1218

I have the same problem.

How do I know which flash-attn version and torch version to install?

vicstef1292 avatar Jan 13 '25 17:01 vicstef1292

The same problem.

Xuekai-Zhu avatar Jan 16 '25 14:01 Xuekai-Zhu

The same problem.

9050350 avatar Feb 25 '25 10:02 9050350

The same problem. I got:

..., line 15, in <module>
    import flash_attn_2_cuda as flash_attn_gpu
ImportError: libcudart.so.11.0: cannot open shared object file: No such file or directory

cuda 12.6 python 3.12 torch 2.6.0 fa 2.7.3
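
libcudart.so.11.0 suggests that this flash-attn build was linked against the CUDA 11 runtime, which a CUDA 12.6 environment doesn't ship. One way to confirm which CUDA runtime the extension actually links against, without importing the broken module (standard importlib + ldd, path resolved at runtime):

so=$(python -c "import importlib.util; print(importlib.util.find_spec('flash_attn_2_cuda').origin)")
ldd "$so" | grep cudart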

Muleizhang avatar Sep 10 '25 12:09 Muleizhang

flash_attn 2.8.3 torch 2.8.0 cuda 12.8 vllm 0.10.2

(Worker pid=3831055) (EngineCore_DP0 pid=3841868)     import flash_attn_2_cuda as flash_attn_gpu
(Worker pid=3831055) (EngineCore_DP0 pid=3841868) ImportError: /volume/med-train/users/mzchen/miniconda3/envs/vllm/lib/python3.12/site-packages/flash_attn_2_cuda.cpython-312-x86_64-linux-gnu.so: undefined symbol: _ZN3c104cuda9SetDeviceEa

IcyFeather233 avatar Sep 22 '25 11:09 IcyFeather233

@IcyFeather233 I had the same issue with the same libraries. In my case, recompiling with pip install flash_attn -U --force-reinstall fixed this error in the end.
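
The same reinstall, spelled out with the --no-build-isolation flag the flash-attention README recommends (so the extension is compiled against the torch that is already installed rather than an isolated build environment); a sketch, not necessarily the exact command used above:

pip install flash_attn -U --force-reinstall --no-build-isolation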

dgiofre avatar Oct 02 '25 17:10 dgiofre

@dgiofre @IcyFeather233

Have tried @dgiofre solution, but still failed.

Then I tried to reinstall flash_attn from the official releases using the following command:

pip install https://github.com/Dao-AILab/flash-attention/releases/download/v2.8.3/flash_attn-2.8.3+cu12torch2.8cxx11abiTRUE-cp310-cp310-linux_x86_64.whl

which installs flash-attn with the abiTRUE tag, i.e. the build compiled with the C++11 ABI.

And it works.

For other torch/CUDA/Python combinations, please check the official releases at https://github.com/Dao-AILab/flash-attention/releases/tag/v2.8.3.
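
To pick the right wheel from that page, the pieces of the filename (cu12..., torch2.x, cxx11abiTRUE/FALSE, cp3xx) just have to match the local environment. A small sketch that prints those pieces using only standard torch and sys attributes:

import sys
import torch

print("torch :", torch.__version__)                # match the torch2.x part of the wheel name
print("cuda  :", torch.version.cuda)               # match the cu12... part
print("abi   :", torch.compiled_with_cxx11_abi())  # match cxx11abiTRUE / cxx11abiFALSE
print("python:", f"cp{sys.version_info.major}{sys.version_info.minor}")  # match cp310 / cp312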

Y-Sui avatar Oct 14 '25 08:10 Y-Sui