Issue with installing flash attention: `import flash_attn_2_cuda as flash_attn_cuda`
Gemma2 needs torch>=2.4.0, as mentioned, because when I run it I get this error:
File "/usr/local/lib/python3.10/dist-packages/transformers/cache_utils.py", line 1656, in __init__
torch._dynamo.mark_static_address(new_layer_key_cache)
AttributeError: module 'torch._dynamo' has no attribute 'mark_static_address'
So this needs torch>=2.4.0, but the currently installed versions are the following:
>>> import torch;torch.__version__
'2.0.1+cu117'
>>> import flash_attn;flash_attn.__version__
'2.5.6'
The problem is that when I tried to install torch '2.4.0+cu118', matching my local CUDA toolkit:
root@0d6c1aeee409:/space/LongLM# nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
I got this error:
>>> import flash_attn;flash_attn.__version__
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.10/dist-packages/flash_attn/__init__.py", line 3, in <module>
from flash_attn.flash_attn_interface import (
File "/usr/local/lib/python3.10/dist-packages/flash_attn/flash_attn_interface.py", line 10, in <module>
import flash_attn_2_cuda as flash_attn_cuda
ImportError: /usr/local/lib/python3.10/dist-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops5zeros4callEN3c108ArrayRefINS2_6SymIntEEENS2_8optionalINS2_10ScalarTypeEEENS6_INS2_6LayoutEEENS6_INS2_6DeviceEEENS6_IbEE
So I uninstalled and reinstalled flash-attention like this:
pip uninstall flash-attn
pip install --no-build-isolation flash-attn==2.5.6 -U --force-reinstall
However, this uninstalls the current torch and installs torch '2.5.1+cu124', and I still hit the same issue:
import flash_attn_2_cuda as flash_attn_cuda
ImportError: /usr/local/lib/python3.10/dist-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops5zeros4callEN3c108ArrayRefINS2_6SymIntEEENS2_8optionalINS2_10ScalarTypeEEENS6_INS2_6LayoutEEENS6_INS2_6DeviceEEENS6_IbEE
So I can't install it!
You should:
- Install torch 2.4 (if that's the version you want)
- Install flash-attn (the latest version, 2.7.0.post2, should work; see the environment check sketched below)
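These undefined-symbol ImportErrors come from a binary mismatch: the prebuilt flash_attn_2_cuda extension was compiled against a different torch version, CUDA version, or C++ ABI than the torch it is imported with. Before picking a version or wheel, it helps to print exactly what your environment provides. A minimal sketch using only standard torch attributes (the flash_attn import at the end is just a smoke test):
import sys
import torch

# The flash-attn build you install must match all four of these.
print("python tag :", f"cp{sys.version_info.major}{sys.version_info.minor}")  # e.g. cp310
print("torch      :", torch.__version__)                                      # e.g. 2.4.0+cu118
print("built cuda :", torch.version.cuda)                                     # CUDA version torch was compiled with
print("cxx11 abi  :", torch.compiled_with_cxx11_abi())                        # corresponds to abiTRUE / abiFALSE wheels

try:
    import flash_attn
    print("flash_attn :", flash_attn.__version__)
except ImportError as exc:
    print("flash_attn import failed:", exc)
If any of these change (for example a pip command silently upgrades torch), the previously installed flash-attn binary has to be rebuilt or replaced with a matching wheel.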
How do I know which flash-attn version and torch version to install? I have the same problem, and it's really annoying.
root@e4b47fc2098b:/workspace/OpenRLHF# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Thu_Jun__6_02:18:23_PDT_2024
Cuda compilation tools, release 12.5, V12.5.82
Build cuda_12.5.r12.5/compiler.34385749_0
root@e4b47fc2098b:/workspace/OpenRLHF# pip show torch
Name: torch
Version: 2.5.1
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: [email protected]
License: BSD-3-Clause
Location: /usr/local/lib/python3.10/dist-packages
Requires: filelock, fsspec, jinja2, networkx, nvidia-cublas-cu12, nvidia-cuda-cupti-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-runtime-cu12, nvidia-cudnn-cu12, nvidia-cufft-cu12, nvidia-curand-cu12, nvidia-cusolver-cu12, nvidia-cusparse-cu12, nvidia-nccl-cu12, nvidia-nvjitlink-cu12, nvidia-nvtx-cu12, sympy, triton, typing-extensions
Required-by: accelerate, bitsandbytes, compressed-tensors, deepspeed, flash-attn, lightning-thunder, openrlhf, optimum, peft, torch-tensorrt, torchmetrics, torchvision, vllm, xformers
root@e4b47fc2098b:/workspace/OpenRLHF# pip show flash_attn
Name: flash-attn
Version: 2.7.0.post2
Summary: Flash Attention: Fast and Memory-Efficient Exact Attention
Home-page: https://github.com/Dao-AILab/flash-attention
Author: Tri Dao
Author-email: [email protected]
License:
Location: /usr/local/lib/python3.10/dist-packages
Requires: einops, torch
Required-by: openrlhf
The flash_attn-2.7.1.post4+cu12torch2.5cxx11abiTRUE-cp310-cp310-linux_x86_64.whl wheel doesn't work either.
I met the same problem. My environment: Linux with CUDA 12.2, torch 1.13.1+cu117, Python 3.10, flash_attn 2.5.9.post1+cu118torch1.13cp310. How can I resolve it?
import flash_attn_2_cuda as flash_attn_cuda
ImportError: undefined symbol: _ZN3c105ErrorC2ENS_14SourceLocationENSt7_cxx1112basic_stringIcSt11char_traitsIcESaIcEE
I have the same problem.
How do i know which flash-attn version and torch version to install ?
The same problem.
The same problem.
The same problem. I got:
..., line 15, in <module>
import flash_attn_2_cuda as flash_attn_gpu
ImportError: libcudart.so.11.0: cannot open shared object file: No such file or directory
Environment: CUDA 12.6, Python 3.12, torch 2.6.0, flash-attn 2.7.3.
Environment: flash_attn 2.8.3, torch 2.8.0, CUDA 12.8, vllm 0.10.2.
(Worker pid=3831055) (EngineCore_DP0 pid=3841868) import flash_attn_2_cuda as flash_attn_gpu
(Worker pid=3831055) (EngineCore_DP0 pid=3841868) ImportError: /volume/med-train/users/mzchen/miniconda3/envs/vllm/lib/python3.12/site-packages/flash_attn_2_cuda.cpython-312-x86_64-linux-gnu.so: undefined symbol: _ZN3c104cuda9SetDeviceEa
@IcyFeather233
I had the same issue with the same library versions. In my case, recompiling flash-attn with
pip install flash_attn -U --force-reinstall
fixed the error.
@dgiofre @IcyFeather233
I tried @dgiofre's solution, but it still failed.
Then I reinstalled flash_attn from the official release wheels using the following command:
pip install https://github.com/Dao-AILab/flash-attention/releases/download/v2.8.3/flash_attn-2.8.3+cu12torch2.8cxx11abiTRUE-cp310-cp310-linux_x86_64.whl
This installs the flash-attn build tagged abiTRUE, i.e. the one compiled with the C++11 ABI.
And it works.
For other library combinations, please check the official releases at https://github.com/Dao-AILab/flash-attention/releases/tag/v2.8.3.
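To pick the right file from that releases page, you can assemble the expected wheel name from your own environment and then look for that exact file. This is only a helper sketch following the naming pattern of the wheels quoted in this thread (recent releases use the CUDA major version, e.g. cu12, while older ones spell it out, e.g. cu118); it is not an official flash-attention tool:
import sys
import torch

fa_version = "2.8.3"  # the flash-attn release you want to install

py_tag    = f"cp{sys.version_info.major}{sys.version_info.minor}"               # e.g. cp310
cuda_tag  = "cu" + torch.version.cuda.split(".")[0]                             # e.g. cu12
torch_tag = "torch" + ".".join(torch.__version__.split("+")[0].split(".")[:2])  # e.g. torch2.8
abi_tag   = "cxx11abi" + ("TRUE" if torch.compiled_with_cxx11_abi() else "FALSE")

wheel = f"flash_attn-{fa_version}+{cuda_tag}{torch_tag}{abi_tag}-{py_tag}-{py_tag}-linux_x86_64.whl"
print(wheel)
# e.g. flash_attn-2.8.3+cu12torch2.8cxx11abiTRUE-cp310-cp310-linux_x86_64.whl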