
ImportError: /lib64/libc.so.6: version `GLIBC_2.32' not found

Open ffaisal93 opened this issue 6 months ago • 41 comments

Environment

  • OS / node image: CentOS 8 / RHEL 8 derivative
  • glibc version: 2.28 (python -c "import platform, os; print(platform.libc_ver())")
  • GPU: NVIDIA A100 (CUDA 12.6)
  • Python: 3.12 (Conda)
  • PyTorch: 2.7.0.dev+cu126 (nightly)
  • Flash-Attention wheel: 2.7.4.post1
    pip install https://github.com/OpenRLHF/flash-attn-…/flash_attn-2.7.4.post1+pt270cu126…whl

Reproduction

# activate env and run:
python -c "import flash_attn"

Error:

venvs/conda/openrlhf/lib/python3.12/site-packages/torch/_subclasses/functional_tensor.py:276: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:81.)
  cpu = _conversion_method_template(device=torch.device("cpu"))
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "venvs/conda/openrlhf/lib/python3.12/site-packages/flash_attn/__init__.py", line 3, in <module>
    from flash_attn.flash_attn_interface import (
  File "venvs/conda/openrlhf/lib/python3.12/site-packages/flash_attn/flash_attn_interface.py", line 15, in <module>
    import flash_attn_2_cuda as flash_attn_gpu
ImportError: /lib64/libc.so.6: version `GLIBC_2.32' not found (required by venvs/conda/openrlhf/lib/python3.12/site-packages/flash_attn_2_cuda.cpython-312-x86_64-linux-gnu.so)
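
For reference, the mismatch can be confirmed by comparing the host glibc with the symbol versions the extension module was linked against (the .so path below just mirrors the traceback):

# host glibc (2.28 here)
ldd --version | head -1
# highest GLIBC symbol versions the wheel's extension requires (2.32 here)
objdump -T venvs/conda/openrlhf/lib/python3.12/site-packages/flash_attn_2_cuda.cpython-312-x86_64-linux-gnu.so | grep -o 'GLIBC_[0-9.]*' | sort -uV | tail -3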

Flash attention: pip install https://github.com/OpenRLHF/flash-attn-2.7.4.post1-builds/releases/download/v0.1/flash_attn-2.7.4.post1+pt270cu128cxx11abiTRUE-cp312-cp312-linux_x86_64.whl

Pip list:

einops==0.8.1
filelock==3.18.0
flash_attn==2.7.4.post1
fsspec==2025.5.1
Jinja2==3.1.6
MarkupSafe==3.0.2
mpmath==1.3.0
networkx==3.5
nvidia-cublas-cu12==12.6.4.1
nvidia-cuda-cupti-cu12==12.6.80
nvidia-cuda-nvrtc-cu12==12.6.77
nvidia-cuda-runtime-cu12==12.6.77
nvidia-cudnn-cu12==9.5.1.17
nvidia-cufft-cu12==11.3.0.4
nvidia-cufile-cu12==1.11.1.6
nvidia-curand-cu12==10.3.7.77
nvidia-cusolver-cu12==11.7.1.2
nvidia-cusparse-cu12==12.5.4.2
nvidia-cusparselt-cu12==0.6.3
nvidia-nccl-cu12==2.26.2
nvidia-nvjitlink-cu12==12.6.85
nvidia-nvtx-cu12==12.6.77
pip==25.1
setuptools==78.1.1
sympy==1.14.0
torch==2.7.1
triton==3.3.1
typing_extensions==4.14.0
wheel==0.45.1

ffaisal93 avatar Jun 11 '25 05:06 ffaisal93

I am having the same error. Built flash attention from source yesterday and that appeared to go well.

Traceback (most recent call last):
  File "/home/ai/apps/flash-attention/tests/test_flash_attn.py", line 7, in <module>
    from flash_attn import (
  File "/home/ai/miniconda3/envs/comfyenv/lib/python3.11/site-packages/flash_attn/__init__.py", line 3, in <module>
    from flash_attn.flash_attn_interface import (
  File "/home/ai/miniconda3/envs/comfyenv/lib/python3.11/site-packages/flash_attn/flash_attn_interface.py", line 15, in <module>
    import flash_attn_2_cuda as flash_attn_gpu
ImportError: /home/ai/miniconda3/envs/comfyenv/bin/../lib/libstdc++.so.6: version `GLIBCXX_3.4.32' not found (required by /home/ai/miniconda3/envs/comfyenv/lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so)

Ubuntu: 25.04, Python: 3.11.11, glibc: 2.41, GPU: RTX 5070 Ti, Driver Version: 570.124.06, CUDA Version: 12.8, PyTorch: 2.7.1+cu126

So I went down the rabbit hole and installed the latest pytorch 2.8.0.dev20250611+cu128 and got a very similar but different error.

Traceback (most recent call last):
  File "/home/ai/apps/flash-attention/tests/test_flash_attn.py", line 7, in <module>
    from flash_attn import (
  File "/home/ai/miniconda3/envs/comfyenv/lib/python3.11/site-packages/flash_attn/__init__.py", line 3, in <module>
    from flash_attn.flash_attn_interface import (
  File "/home/ai/miniconda3/envs/comfyenv/lib/python3.11/site-packages/flash_attn/flash_attn_interface.py", line 15, in <module>
    import flash_attn_2_cuda as flash_attn_gpu
ImportError: /home/ai/miniconda3/envs/comfyenv/bin/../lib/libstdc++.so.6: version `CXXABI_1.3.15' not found (required by /home/ai/miniconda3/envs/comfyenv/lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so)

There is also a really good description of the error and possible solutions in section 3.4 of this page (which I could not get working): https://gcc.gnu.org/onlinedocs/libstdc++/faq.html#faq.how_to_install
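
For the GLIBCXX/CXXABI variant, a minimal check along those lines might look like this (paths are illustrative, and the conda-forge step is an assumption rather than something verified in this thread):

# compare what conda's bundled libstdc++ provides vs. the system copy
strings /home/ai/miniconda3/envs/comfyenv/lib/libstdc++.so.6 | grep GLIBCXX | tail -3
strings /usr/lib/x86_64-linux-gnu/libstdc++.so.6 | grep GLIBCXX | tail -3
# if the system copy is newer, pulling a newer libstdc++ into the env may help
conda install -c conda-forge libstdcxx-ng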

bradison1 avatar Jun 11 '25 21:06 bradison1

I ended up solving this issue by burning the environment down and rebuilding it. For me, FlashAttention built successfully with Python 3.13 and GCC 13/G++ 13 (which I don't believe is the default compiler for that Python version). I still had problems with Python 3.12 for some reason.

I have several versions of GCC and G++ installed, so this article was instrumental in managing those and allowing me to experiment.

https://linuxconfig.org/how-to-switch-between-multiple-gcc-and-g-compiler-versions-on-ubuntu-20-04-lts-focal-fossa

I don't think it was needed, but I also threw these in for good measure:

export CC=/usr/bin/gcc-13
export CXX=/usr/bin/g++-13
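
The article boils down to registering each compiler with update-alternatives and then choosing one interactively; a sketch, assuming gcc-12 and gcc-13 are both installed from the Ubuntu repos:

sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 12 --slave /usr/bin/g++ g++ /usr/bin/g++-12
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-13 13 --slave /usr/bin/g++ g++ /usr/bin/g++-13
sudo update-alternatives --config gcc   # pick the active version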

Good luck!

bradison1 avatar Jun 13 '25 19:06 bradison1

Same issue for v2.8.0.post2 cu12torch2.7_abi=False. GPU is 5090, torch is v2.7.1+cu128. ubuntu 20.04

zipzou avatar Jun 15 '25 14:06 zipzou

Same issue for v2.8.0.post2 cu12torch2.5_abi=False. GPU is A100, torch is v2.5.1+cu121. ubuntu 20.04

YanxingLiu avatar Jun 16 '25 03:06 YanxingLiu

The issue in v2.8.0.post2 is caused by this commit https://github.com/Dao-AILab/flash-attention/commit/71f7ac258ac193bf2cecd2c82a0d6e22bcba157f

efsotr avatar Jun 16 '25 07:06 efsotr

GitHub runners no longer support Ubuntu 20.04, so we had to compile the wheels on Ubuntu 22.04. The right thing is to have the GitHub runners compile with manylinux, but I haven't figured out how to set that up.

tridao avatar Jun 16 '25 16:06 tridao

But a wheel built against GLIBC 2.32 causes compatibility problems on many systems, and the official torch wheels themselves only require a lower glibc version.

zipzou avatar Jun 17 '25 00:06 zipzou

Right, we'd appreciate help with configuring the github runner to use manylinux https://github.com/Dao-AILab/flash-attention/blob/main/.github/workflows/publish.yml

tridao avatar Jun 17 '25 04:06 tridao

Same issue.

Status message: Job entrypoint command failed with exit code 1, last available logs (truncated to 20,000 chars):
    from openrlhf.trainer.ray import (
  File "/tmp/ray/session_2025-06-18_02-19-51_497600_4018193/runtime_resources/working_dir_files/_ray_pkg_d41598a7255f91ed/openrlhf/trainer/__init__.py", line 1, in <module>
    from .dpo_trainer import DPOTrainer
  File "/tmp/ray/session_2025-06-18_02-19-51_497600_4018193/runtime_resources/working_dir_files/_ray_pkg_d41598a7255f91ed/openrlhf/trainer/dpo_trainer.py", line 5, in <module>
    from flash_attn.utils.distributed import all_gather
  File "/home/linyun/anaconda3/envs/r1-searcher/lib/python3.10/site-packages/flash_attn/__init__.py", line 3, in <module>
    from flash_attn.flash_attn_interface import (
  File "/home/linyun/anaconda3/envs/r1-searcher/lib/python3.10/site-packages/flash_attn/flash_attn_interface.py", line 15, in <module>
    import flash_attn_2_cuda as flash_attn_gpu
ImportError: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by /home/linyun/anaconda3/envs/r1-searcher/lib/python3.10/site-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so)

How can I solve this?

dogeeelin avatar Jun 18 '25 02:06 dogeeelin

@dogeeelin Currently, the released wheels are built on Ubuntu 22.04, which has a newer glibc. Until the build environment is fixed, you can build the wheel from source or try another released wheel. With torch>=2.7, we have to compile it ourselves.
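
A minimal from-source install on the older-glibc host might look like this (assuming an nvcc matching your torch CUDA version and a reasonably new g++ are on PATH):

pip install ninja packaging
# MAX_JOBS keeps the build from exhausting RAM; raise it if you have headroom
MAX_JOBS=4 pip install flash-attn --no-build-isolation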

zipzou avatar Jun 18 '25 02:06 zipzou

Same issue for v2.8.0.post2 cu12torch2.5_abi=False. GPU is A100, torch is v2.5.1+cu121. ubuntu 20.04

It worked! https://blog.csdn.net/qq_44817196/article/details/136203069

ykj467422034 avatar Jun 19 '25 04:06 ykj467422034

@dogeeelin Currently, the released wheels are built on Ubuntu 22.04, which has a newer glibc. Until the build environment is fixed, you can build the wheel from source or try another released wheel. With torch>=2.7, we have to compile it ourselves.

Thank you, this worked when I downgraded to flash-attn==2.7.4.post1 compiled with torch <= 2.6 on Ubuntu 20.04: https://github.com/Dao-AILab/flash-attention/releases/tag/v2.7.4.post1

pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu126
pip install flash-attn==2.7.4.post1 --no-build-isolation

rin2401 avatar Jun 19 '25 07:06 rin2401

pip install flash_attn==2.7.4.post1

This downgrades flash_attn from 2.8.1.post2 to 2.7.4.post1 and works on my Ubuntu 20.04, whose GLIBC is 2.31.

ZetaJ7 avatar Jun 23 '25 08:06 ZetaJ7

Installing 2.7.4.post1 worked for me. Also, the cluster environment that I am working on has gcc/8.5.0 as the default, so I needed to load 9.5 or something higher every time to make it work.

ali-kerem avatar Jun 23 '25 14:06 ali-kerem

pip install flash_attn==2.7.4.post1

This downgrades flash_attn from 2.8.1.post2 to 2.7.4.post1 and works on my Ubuntu 20.04, whose GLIBC is 2.31.

In this case, what's your pytorch version?

dogeeelin avatar Jun 24 '25 01:06 dogeeelin

Installing 2.7.4.post1 worked for me. Also, the cluster environment that I am working on has gcc/8.5.0 as the default, so I needed to load 9.5 or something higher every time to make it work.

In this case, what's your pytorch version?

dogeeelin avatar Jun 24 '25 01:06 dogeeelin

pip install flash_attn==2.7.4.post1

This downgrades flash_attn from 2.8.1.post2 to 2.7.4.post1 and works on my Ubuntu 20.04, whose GLIBC is 2.31.

In this case, what's your pytorch version?

torch 2.4.0 with CUDA 12.1

ZetaJ7 avatar Jun 24 '25 02:06 ZetaJ7

Installing 2.7.4.post1 worked for me. Also, the cluster environment that I am working on has gcc/8.5.0 as the default, so I needed to load 9.5 or something higher every time to make it work.

In this case, what's your pytorch version?

2.7.1 with CUDA 12.6

ali-kerem avatar Jun 24 '25 08:06 ali-kerem

I had some issues installing flash attention directly. My setup is torch 2.7 with cuda 12.6. The following approach worked:

https://github.com/Dao-AILab/flash-attention/issues/1644#issuecomment-2899396361

safal312 avatar Jun 28 '25 11:06 safal312

pip install flash_attn==2.7.4.post1

This downgrades flash_attn from 2.8.1.post2 to 2.7.4.post1 and works on my Ubuntu 20.04, whose GLIBC is 2.31.

This works for me. My setup is torch 2.5 with CUDA 12.1.

journey1234-liu avatar Jul 04 '25 09:07 journey1234-liu

pip install flash_attn==2.7.4.post1

you're the hero

jvonrad avatar Jul 05 '25 19:07 jvonrad

Same error for me. I solved this problem faster using pip install https://github.com/Dao-AILab/flash-attention/releases/download/v2.7.4.post1/flash_attn-2.7.4.post1+cu12torch2.6cxx11abiTRUE-cp39-cp39-linux_x86_64.whl.

You can select the whl file that matches your environment.
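
For reference, the values encoded in the wheel filename (torch version, CUDA version, cxx11abi flag, CPython tag) can be read off your environment like this:

# torch version, CUDA version, and whether torch was built with the C++11 ABI
python -c "import torch; print(torch.__version__, torch.version.cuda, torch._C._GLIBCXX_USE_CXX11_ABI)"
# CPython tag (e.g. cp312)
python -c "import sys; print('cp%d%d' % sys.version_info[:2])"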

pip install flash_attn==2.7.4.post1

This downgrades flash_attn from 2.8.1.post2 to 2.7.4.post1 and works on my Ubuntu 20.04, whose GLIBC is 2.31.

kmuhan avatar Jul 06 '25 11:07 kmuhan

flash_attn == 2.7.4.post1

It works for me.

hairuoliu1 avatar Jul 09 '25 14:07 hairuoliu1

With all the "high fives" and workaround confirmations, I just wanted to highlight that this problem is still very much alive. The workaround is great if you are on 20.04, but if you are running a Blackwell card with Linux you are forced to use at least PyTorch 2.7 and CUDA 12.8, which is still an issue.

For me, I also had to use Ubuntu 25.04 to even get my 5070 Ti card recognized by the OS, which complicated the GCC/G++ issue. My workaround is posted above, but I am limited to Python 3.13, which presents problems for some software packages.

@tridao's comment appears to be the correct fix, but there doesn't seem to be much progress on it.

bradison1 avatar Jul 11 '25 22:07 bradison1

@bradison1 The best approach seems to be compiling the wheels on a manylinux + CUDA 12.9 environment to support PyTorch 2.7.x and Blackwell GPUs. However, I’m not sure whether GitHub Actions supports this setup.
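
A rough sketch of the idea, just to make it concrete (the image choice here is an assumption, not the project's CI recipe, and a CUDA toolkit would still need to be installed inside the container, which is not shown):

# build inside a manylinux_2_28 container so the wheel links against glibc 2.28
docker run --rm -v "$PWD":/io -w /io quay.io/pypa/manylinux_2_28_x86_64 bash -c '
  /opt/python/cp312-cp312/bin/pip install ninja packaging wheel &&
  /opt/python/cp312-cp312/bin/pip install torch --index-url https://download.pytorch.org/whl/cu128 &&
  /opt/python/cp312-cp312/bin/python setup.py bdist_wheel
'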

zipzou avatar Jul 12 '25 01:07 zipzou

Thanks to everyone who looked into this!

cognivore avatar Jul 14 '25 18:07 cognivore

@dogeeelin Currently, the released wheels are built on Ubuntu 22.04, which has a newer glibc. Until the build environment is fixed, you can build the wheel from source or try another released wheel. With torch>=2.7, we have to compile it ourselves.

Thank you, this worked when I downgraded to flash-attn==2.7.4.post1 compiled with torch <= 2.6 on Ubuntu 20.04: https://github.com/Dao-AILab/flash-attention/releases/tag/v2.7.4.post1

pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu126
pip install flash-attn==2.7.4.post1 --no-build-isolation

Tried this but got the following error when importing flash_attn:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/XXX/miniconda3/envs/RRC_DocVQA/lib/python3.10/site-packages/flash_attn/__init__.py", line 3, in <module>
    from flash_attn.flash_attn_interface import (
  File "/XXX/miniconda3/envs/RRC_DocVQA/lib/python3.10/site-packages/flash_attn/flash_attn_interface.py", line 15, in <module>
    import flash_attn_2_cuda as flash_attn_gpu
ModuleNotFoundError: No module named 'flash_attn_2_cuda'

Has anyone faced the same issue?
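
One thing worth checking here is whether the compiled extension was packaged at all; a quick look (a guess at the cause, not a confirmed fix):

# the extension should show up as a top-level .so in the wheel's file list
pip show -f flash_attn | grep flash_attn_2_cuda
# if it is missing, the install likely fell back to a build that skipped the CUDA
# extension; reinstalling with nvcc on PATH and --no-build-isolation usually rebuilds it
pip install --force-reinstall --no-cache-dir flash-attn==2.7.4.post1 --no-build-isolation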

CQHofsns avatar Jul 23 '25 15:07 CQHofsns

Right, we'd appreciate help with configuring the github runner to use manylinux https://github.com/Dao-AILab/flash-attention/blob/main/.github/workflows/publish.yml

@tridao I am currently testing compilation on the manylinux_2_28 platform and have made substantial progress. Once the build artifacts are validated, I will submit a pull request to this project. This should fundamentally resolve the GLIBC compatibility issues.

zipzou avatar Jul 26 '25 16:07 zipzou

@dogeeelin Currently, the released wheels are built on Ubuntu 22.04, which has a newer glibc. Until the build environment is fixed, you can build the wheel from source or try another released wheel. With torch>=2.7, we have to compile it ourselves.

Thank you, this worked when I downgraded to flash-attn==2.7.4.post1 compiled with torch <= 2.6 on Ubuntu 20.04: https://github.com/Dao-AILab/flash-attention/releases/tag/v2.7.4.post1

pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu126
pip install flash-attn==2.7.4.post1 --no-build-isolation

Solved!

Ethan16162 avatar Aug 05 '25 08:08 Ethan16162

@zipzou That's great news! Looking forward to your fix.

bradison1 avatar Aug 06 '25 03:08 bradison1