flash-attention
flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so: undefined symbol
How do I fix it?
Which version of pytorch and CUDA are you using? Which version of flash attention are you using?
torch 2.1.0, CUDA 12.1, and flash-attn 2.4.2
I encountered a similar error while attempting to install the package using the command
pip install flash_attn-2.5.5+cu118torch2.2cxx11abiTRUE-cp38-cp38-linux_x86_64.whl
However, the issue was resolved by using the command
pip install flash_attn-2.5.5+cu118torch2.2cxx11abiFALSE-cp38-cp38-linux_x86_64.whl
PyTorch version 2.2.0
CUDA version 11.8
Python version 3.8.18
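If you are unsure whether you need the cxx11abiTRUE or cxx11abiFALSE wheel, a minimal check (plain PyTorch introspection, nothing flash-attn specific) is to ask the installed torch which C++ ABI and CUDA version it was built with:

import sys
import torch

print("python   :", sys.version.split()[0])
print("torch    :", torch.__version__)
print("cuda     :", torch.version.cuda)               # CUDA version torch was built against
print("cxx11abi :", torch.compiled_with_cxx11_abi())  # True -> cxx11abiTRUE wheel, False -> cxx11abiFALSE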
I also encountered the issue. Could you please advise how to fix the issue? Thanks.
import torch
import flash_attn_2_cuda as flash_attn_cuda
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: /XXXXX/lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so: undefined symbol: _ZN3c104cuda9SetDeviceEi
My environment is as follows:
- OS: Red Hat Enterprise Linux 8.4
- Python: 3.11.7
- PyTorch: 2.3.1+cu121
- CUDA: 12.2
- flash-attn: 2.5.9.post1
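Before chasing wheels, it can help to see what the missing symbol actually is and whether the installed torch exports it. A small diagnostic sketch, assuming a Linux box with binutils (c++filt and nm) available:

import glob
import os
import subprocess
import torch

sym = "_ZN3c104cuda9SetDeviceEi"

# Demangle the symbol: it comes out as c10::cuda::SetDevice(int), i.e. something
# the flash-attn extension expects torch's own libraries to provide.
print(subprocess.run(["c++filt", sym], capture_output=True, text=True).stdout.strip())

# Check whether any of the installed torch shared libraries define it. If nothing
# prints, the flash-attn wheel was built against a different torch build than the
# one that is installed.
torch_lib = os.path.join(os.path.dirname(torch.__file__), "lib")
for so in sorted(glob.glob(os.path.join(torch_lib, "*.so*"))):
    out = subprocess.run(["nm", "-D", "--defined-only", so], capture_output=True, text=True).stdout
    if sym in out:
        print("defined in:", so)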
I just found a solution here: https://github.com/Dao-AILab/flash-attention/issues/620. Downgrade CUDA to version 11 and install this wheel instead: https://github.com/Dao-AILab/flash-attention/releases/download/v2.5.8/flash_attn-2.5.8+cu118torch2.3cxx11abiFALSE-cp311-cp311-linux_x86_64.whl
pip install https://github.com/Dao-AILab/flash-attention/releases/download/v2.5.8/flash_attn-2.5.8+cu118torch2.3cxx11abiFALSE-cp311-cp311-linux_x86_64.whl
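After installing a wheel like this, a quick smoke test (this assumes an Ampere-or-newer NVIDIA GPU is available, since flash-attn only supports those) is a tiny forward pass:

import torch
from flash_attn import flash_attn_func

# Tiny random q/k/v in half precision on the GPU, shaped (batch, seqlen, nheads, headdim).
q = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

out = flash_attn_func(q, k, v, causal=True)
print(out.shape)  # expected: torch.Size([1, 128, 8, 64])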
For anyone who is here after trying to set up unsloth on RunPod or PrimeIntellect: both of those services have an "axolotl" Docker image where flash-attn already works. Use that, then run:
pip install "unsloth[cu118-ampere-torch211] @ git+https://github.com/unslothai/unsloth.git" --no-dependencies --disable-pip-version-check
pip install trl --no-dependencies --disable-pip-version-check --upgrade
pip install transformers --no-dependencies --disable-pip-version-check --upgrade
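Since those installs use --no-dependencies, pip does not check that the versions are mutually compatible, so it is worth confirming afterwards that everything still imports. A minimal check (flash_attn is the import that throws the undefined-symbol ImportError if the wheel does not match torch):

import flash_attn
import torch
import transformers
import trl

print("torch       :", torch.__version__)
print("flash-attn  :", flash_attn.__version__)
print("transformers:", transformers.__version__)
print("trl         :", trl.__version__)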
When trying to run DeepSpeed in the bare-bones RunPod PyTorch instances, I finally got flash-attention installed by starting with the PyTorch 2.4.0 image and then using:
- deepspeed==0.16.3
- flash-attn==2.7.3
- torch==2.5.1
- cuda V12.4.131
- python 3.11.10
I didn't downgrade anything; I just found, on the flash_attn releases page, a version suitable for my Python, CUDA, and PyTorch setup, compiled with _GLIBCXX_USE_CXX11_ABI=0 (cxx11abiFALSE in the wheel name).
Example:
- Python 3.10.2
- CUDA 12.4
- Torch 2.6.0
- flash_attn 2.7.3.
For that setup, install it like this:
pip install https://github.com/Dao-AILab/flash-attention/releases/download/v2.7.3/flash_attn-2.7.3+cu12torch2.6cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
Thanks to @nspyf and @ghost for pointing it out.
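If it helps, here is a rough sketch that prints a candidate wheel filename from the local environment. The naming pattern is inferred from the release assets (newer releases use a short CUDA tag like cu12, older ones spell it out as cu118/cu121), so treat the output as a starting point and check it against the actual asset list:

import sys
import torch

py_tag = f"cp{sys.version_info.major}{sys.version_info.minor}"        # e.g. cp310
torch_tag = ".".join(torch.__version__.split("+")[0].split(".")[:2])  # e.g. 2.6
cuda_tag = "cu" + (torch.version.cuda or "cpu").split(".")[0]         # e.g. cu12
abi_tag = "TRUE" if torch.compiled_with_cxx11_abi() else "FALSE"

flash_ver = "2.7.3"  # whichever flash-attn release you are after (placeholder)
print(f"flash_attn-{flash_ver}+{cuda_tag}torch{torch_tag}"
      f"cxx11abi{abi_tag}-{py_tag}-{py_tag}-linux_x86_64.whl")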
Connected issues
- https://github.com/Dao-AILab/flash-attention/issues/620
Same issue here.
+1
I went to https://github.com/Dao-AILab/flash-attention/releases/ and chose a version that made sense for my setup. This is what worked for me: I downloaded the flash_attn-2.7.4.post1+cu12torch2.6cxx11abiTRUE-cp39-cp39-linux_x86_64.whl wheel from the releases page and installed it:
! pip install flash_attn-2.7.4.post1+cu12torch2.6cxx11abiTRUE-cp39-cp39-linux_x86_64.whl
My system:
- OS: Ubuntu 22.04
- PyTorch: 2.7.0+cu126
- Python: 3.9.19 (main, May 6 2024, 19:43:03) [GCC 11.2.0]
- CUDA: 12.2
For those who hit the same situation as me:
What I did
FROM axolotlai/axolotl-base:main-base-py3.11-cu124-2.6.0
RUN pip install --no-build-isolation axolotl[flash-attn,deepspeed]==0.12.2
...
Error message:
>>> import flash_attn
flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so: undefined symbol: _ZN3c105ErrorC2ENS_14SourceLocationENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
How I fixed it
I just removed flash-attn from the bracketed extras. It is already installed in the base image in my setup, but the command above replaces it, and that replacement is what breaks the import.
FROM axolotlai/axolotl-base:main-base-py3.11-cu124-2.6.0
RUN pip install --no-build-isolation axolotl[deepspeed]==0.12.2
...
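To confirm inside the built image that the base image's flash-attn is still the one in use, something along these lines can be run in the container (or as a RUN python -c ... step). Importing flash_attn_2_cuda directly is the part that fails with the undefined-symbol error, so that import is the real regression test:

import flash_attn
import torch

print("flash_attn:", flash_attn.__version__, "->", flash_attn.__file__)
print("torch     :", torch.__version__, "cuda", torch.version.cuda)

import flash_attn_2_cuda  # the CUDA extension the error message points at
print("flash_attn_2_cuda loaded OK")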
Same issue here
How do I fix this for torch 2.6?