ImportError: /lib64/libc.so.6: version `GLIBC_2.32' not found
Environment
- OS / node image: CentOS 8 / RHEL 8 derivative
- glibc version: 2.28 (checked with python -c "import platform, os; print(platform.libc_ver())")
- GPU: NVIDIA A100 (CUDA 12.6)
- Python: 3.12 (Conda)
- PyTorch: 2.7.0.dev+cu126 (nightly)
- Flash-Attention wheel: 2.7.4.post1
pip install https://github.com/OpenRLHF/flash-attn-…/flash_attn-2.7.4.post1+pt270cu126…whl
Reproduction
# activate env and run:
python -c "import flash_attn"
Error:
venvs/conda/openrlhf/lib/python3.12/site-packages/torch/_subclasses/functional_tensor.py:276: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:81.)
cpu = _conversion_method_template(device=torch.device("cpu"))
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "venvs/conda/openrlhf/lib/python3.12/site-packages/flash_attn/__init__.py", line 3, in <module>
from flash_attn.flash_attn_interface import (
File "venvs/conda/openrlhf/lib/python3.12/site-packages/flash_attn/flash_attn_interface.py", line 15, in <module>
import flash_attn_2_cuda as flash_attn_gpu
ImportError: /lib64/libc.so.6: version `GLIBC_2.32' not found (required by venvs/conda/openrlhf/lib/python3.12/site-packages/flash_attn_2_cuda.cpython-312-x86_64-linux-gnu.so)
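One way to confirm the mismatch is to compare the system glibc with the versions the prebuilt extension asks for (a rough sketch; the .so path is the one from the traceback and should be adjusted to your install):
# glibc provided by the system
ldd --version | head -n 1
# GLIBC symbol versions required by the prebuilt extension
objdump -T venvs/conda/openrlhf/lib/python3.12/site-packages/flash_attn_2_cuda.cpython-312-x86_64-linux-gnu.so | grep -o 'GLIBC_[0-9.]*' | sort -Vu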
Flash attention: pip install https://github.com/OpenRLHF/flash-attn-2.7.4.post1-builds/releases/download/v0.1/flash_attn-2.7.4.post1+pt270cu128cxx11abiTRUE-cp312-cp312-linux_x86_64.whl
Pip list:
einops==0.8.1
filelock==3.18.0
flash_attn==2.7.4.post1
fsspec==2025.5.1
Jinja2==3.1.6
MarkupSafe==3.0.2
mpmath==1.3.0
networkx==3.5
nvidia-cublas-cu12==12.6.4.1
nvidia-cuda-cupti-cu12==12.6.80
nvidia-cuda-nvrtc-cu12==12.6.77
nvidia-cuda-runtime-cu12==12.6.77
nvidia-cudnn-cu12==9.5.1.17
nvidia-cufft-cu12==11.3.0.4
nvidia-cufile-cu12==1.11.1.6
nvidia-curand-cu12==10.3.7.77
nvidia-cusolver-cu12==11.7.1.2
nvidia-cusparse-cu12==12.5.4.2
nvidia-cusparselt-cu12==0.6.3
nvidia-nccl-cu12==2.26.2
nvidia-nvjitlink-cu12==12.6.85
nvidia-nvtx-cu12==12.6.77
pip==25.1
setuptools==78.1.1
sympy==1.14.0
torch==2.7.1
triton==3.3.1
typing_extensions==4.14.0
wheel==0.45.1
I am having the same error. Built flash attention from source yesterday and that appeared to go well.
Traceback (most recent call last):
File "/home/ai/apps/flash-attention/tests/test_flash_attn.py", line 7, in <module>
from flash_attn import (
File "/home/ai/miniconda3/envs/comfyenv/lib/python3.11/site-packages/flash_attn/__init__.py", line 3, in <module>
from flash_attn.flash_attn_interface import (
File "/home/ai/miniconda3/envs/comfyenv/lib/python3.11/site-packages/flash_attn/flash_attn_interface.py", line 15, in <module>
import flash_attn_2_cuda as flash_attn_gpu
ImportError: /home/ai/miniconda3/envs/comfyenv/bin/../lib/libstdc++.so.6: version `GLIBCXX_3.4.32' not found (required by /home/ai/miniconda3/envs/comfyenv/lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so)
Ubuntu: 25.04, Python: 3.11.11, glibc: 2.41, GPU: RTX 5070 Ti, Driver Version: 570.124.06, CUDA Version: 12.8, PyTorch: 2.7.1+cu126
So I went down the rabbit hole and installed the latest pytorch 2.8.0.dev20250611+cu128 and got a very similar but different error.
Traceback (most recent call last):
File "/home/ai/apps/flash-attention/tests/test_flash_attn.py", line 7, in <module>
from flash_attn import (
File "/home/ai/miniconda3/envs/comfyenv/lib/python3.11/site-packages/flash_attn/__init__.py", line 3, in <module>
from flash_attn.flash_attn_interface import (
File "/home/ai/miniconda3/envs/comfyenv/lib/python3.11/site-packages/flash_attn/flash_attn_interface.py", line 15, in <module>
import flash_attn_2_cuda as flash_attn_gpu
ImportError: /home/ai/miniconda3/envs/comfyenv/bin/../lib/libstdc++.so.6: version `CXXABI_1.3.15' not found (required by /home/ai/miniconda3/envs/comfyenv/lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so)
There is also a really good description of the error and possible solutions in section 3.4 of this page (which I could not get working): https://gcc.gnu.org/onlinedocs/libstdc++/faq.html#faq.how_to_install
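A quick way to see which versions the env's libstdc++ actually exports (a sketch; adjust the path to your conda env):
# list the GLIBCXX/CXXABI versions provided by the conda env's libstdc++
strings /home/ai/miniconda3/envs/comfyenv/lib/libstdc++.so.6 | grep -E 'GLIBCXX_|CXXABI_' | sort -Vu
# one common remedy (not verified here): pull a newer libstdc++ into the env from conda-forge
# conda install -c conda-forge libstdcxx-ng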
I ended up solving this issue by burning the environment down and rebuilding it. For me, FlashAttention built successfully with Python 3.13 and GCC 13/G++ 13 (which I don't believe is the default compiler for a Python 3.13 environment). I still had problems with Python 3.12 for some reason.
I have several versions of GCC and G++ installed, so this article was instrumental in managing those and allowing me to experiment.
https://linuxconfig.org/how-to-switch-between-multiple-gcc-and-g-compiler-versions-on-ubuntu-20-04-lts-focal-fossa
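That article's approach boils down to registering each compiler with update-alternatives and then picking the default; roughly (priorities are illustrative):
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-13 13
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-13 13
sudo update-alternatives --config gcc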
I don't think it was needed, but I also threw these in for good measure:
export CC=/usr/bin/gcc-13
export CXX=/usr/bin/g++-13
Good luck!
Same issue for v2.8.0.post2 cu12torch2.7_abi=False. GPU is 5090, torch is v2.7.1+cu128. ubuntu 20.04
Same issue for v2.8.0.post2 cu12torch2.5_abi=False. GPU is A100, torch is v2.5.1+cu121. ubuntu 20.04
The issue in v2.8.0.post2 is caused by this commit https://github.com/Dao-AILab/flash-attention/commit/71f7ac258ac193bf2cecd2c82a0d6e22bcba157f
GitHub runners no longer support Ubuntu 20.04, so we had to compile the wheels on Ubuntu 22.04. The right thing is to have the GitHub runners compile with manylinux, but I haven't figured out how to set that up.
But a wheel built against GLIBC 2.32 can cause many compatibility problems, since torch itself only requires a lower glibc version.
Right, we'd appreciate help with configuring the github runner to use manylinux https://github.com/Dao-AILab/flash-attention/blob/main/.github/workflows/publish.yml
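For anyone checking candidate wheels locally, auditwheel can report which manylinux tag (and therefore minimum glibc) a built wheel actually satisfies; a sketch, with a placeholder filename:
pip install auditwheel
auditwheel show flash_attn-2.8.0.post2-cp312-cp312-linux_x86_64.whl  # placeholder filename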
same issue.
Status message: Job entrypoint command failed with exit code 1, last available logs (truncated to 20,000 chars):
from openrlhf.trainer.ray import (
File "/tmp/ray/session_2025-06-18_02-19-51_497600_4018193/runtime_resources/working_dir_files/_ray_pkg_d41598a7255f91ed/openrlhf/trainer/__init__.py", line 1, in <module>
from .dpo_trainer import DPOTrainer
File "/tmp/ray/session_2025-06-18_02-19-51_497600_4018193/runtime_resources/working_dir_files/_ray_pkg_d41598a7255f91ed/openrlhf/trainer/dpo_trainer.py", line 5, in <module>
from flash_attn.utils.distributed import all_gather
File "/home/linyun/anaconda3/envs/r1-searcher/lib/python3.10/site-packages/flash_attn/__init__.py", line 3, in <module>
from flash_attn.flash_attn_interface import (
File "/home/linyun/anaconda3/envs/r1-searcher/lib/python3.10/site-packages/flash_attn/flash_attn_interface.py", line 15, in <module>
import flash_attn_2_cuda as flash_attn_gpu
ImportError: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by /home/linyun/anaconda3/envs/r1-searcher/lib/python3.10/site-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so)
How do I solve this?
@dogeeelin Currently, the released wheel is built on Ubuntu 22.04, which has a higher glibc version. Until the build environment issue is solved, you can build the wheel from source or try another released wheel. If you use torch>=2.7, we have to compile it ourselves.
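If you build from source on the target machine (so the extension links against the local glibc/libstdc++), the documented route is roughly:
# MAX_JOBS caps parallel compile jobs if RAM is limited
MAX_JOBS=4 pip install flash-attn --no-build-isolation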
Same issue for v2.8.0.post2 cu12torch2.5_abi=False. GPU is A100, torch is v2.5.1+cu121. ubuntu 20.04
It worked! https://blog.csdn.net/qq_44817196/article/details/136203069
@dogeeelin Currently, the released wheel is built on Ubuntu 22.04, which has a higher glibc version. Until the build environment issue is solved, you can build the wheel from source or try another released wheel. If you use torch>=2.7, we have to compile it ourselves.
Thank you, this worked when I downgraded to flash-attn 2.7.4.post1 compiled with torch <= 2.6 and Ubuntu 20.04: https://github.com/Dao-AILab/flash-attention/releases/tag/v2.7.4.post1
pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu126
pip install flash-attn==2.7.4.post1 --no-build-isolation
pip install flash_attn==2.7.4.post1
This downgrades flash_attn from 2.8.1.post2 to 2.7.4.post1 and works on my Ubuntu 20.04, whose GLIBC is 2.31.
Installing 2.7.4.post1 worked for me. Also, the cluster environment that I am working on has gcc/8.5.0 as the default, so I needed to load 9.5 or something higher every time to make it work.
pip install flash_attn==2.7.4.post1
This downgrades flash_attn from 2.8.1.post2 to 2.7.4.post1 and works on my Ubuntu 20.04, whose GLIBC is 2.31.
In this case, what's your pytorch version?
Installing 2.7.4.post1 worked for me. Also, the cluster environment that I am working on has gcc/8.5.0 as the default, so I needed to load 9.5 or something higher every time to make it work.
In this case, what's your pytorch version?
pip install flash_attn==2.7.4.post1
Downgrading flash_attn from 2.8.1.post2 to 2.7.4.post1 works on my Ubuntu 20.04, whose GLIBC is 2.31.
In this case, what's your pytorch version?
torch 2.4.0 with CUDA 12.1
Installing 2.7.4.post1 worked for me. Also, the cluster environment that I am working on has gcc/8.5.0 as the default, so I needed to load 9.5 or something higher every time to make it work.
In this case, what's your pytorch version?
2.7.1 with CUDA 12.6
I had some issues installing flash attention directly. My setup is torch 2.7 with CUDA 12.6. The following approach worked:
https://github.com/Dao-AILab/flash-attention/issues/1644#issuecomment-2899396361
pip install flash_attn==2.7.4.post1
This downgrades flash_attn from 2.8.1.post2 to 2.7.4.post1 and works on my Ubuntu 20.04, whose GLIBC is 2.31.
This works for me. My setup is torch 2.5 with CUDA 12.1.
pip install flash_attn==2.7.4.post1
you're the hero
Same error for me; I solved this problem faster using pip install https://github.com/Dao-AILab/flash-attention/releases/download/v2.7.4.post1/flash_attn-2.7.4.post1+cu12torch2.6cxx11abiTRUE-cp39-cp39-linux_x86_64.whl.
You can select the whl file which matches with your environment.
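To read off the fields that the wheel filenames encode (Python tag, torch and CUDA versions, and the C++ ABI flag), something like this works:
python -c "import sys, torch; print(sys.version_info[:2], torch.__version__, torch.version.cuda, torch.compiled_with_cxx11_abi())"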
pip install flash_attn==2.7.4.post1
This downgrades flash_attn from 2.8.1.post2 to 2.7.4.post1 and works on my Ubuntu 20.04, whose GLIBC is 2.31.
flash_attn == 2.7.4.post1
It works for me.
With all the "high fives" and workaround confirmations, I just wanted to highlight that this problem is still very much alive. The workaround is great if you are on 20.04, but if you are running a Blackwell card with Linux you are forced to use at least PyTorch 2.7 and CUDA 12.8, which is still an issue.
For me, I also had to use Ubuntu 25.04 to even get my 5070 Ti card recognized by the OS, which complicated the GCC/G++ issue. My workaround is posted above, but I am limited to Python 3.13, which presents problems for some software packages.
@tridao's comment appears to be the correct fix, but there doesn't seem to be much progress on it.
@bradison1 The best approach seems to be compiling the wheels on a manylinux + CUDA 12.9 environment to support PyTorch 2.7.x and Blackwell GPUs. However, I’m not sure whether GitHub Actions supports this setup.
Thanks to everyone who looked into this!
@dogeeelin Currently, the released wheel is built on Ubuntu 22.04, which has a higher glibc version. Until the build environment issue is solved, you can build the wheel from source or try another released wheel. If you use torch>=2.7, we have to compile it ourselves.
Thank you, this worked when I downgraded to flash-attn 2.7.4.post1 compiled with torch <= 2.6 and Ubuntu 20.04: https://github.com/Dao-AILab/flash-attention/releases/tag/v2.7.4.post1
pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu126
pip install flash-attn==2.7.4.post1 --no-build-isolation
Tried this but got the following error when importing flash_attn:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/XXX/miniconda3/envs/RRC_DocVQA/lib/python3.10/site-packages/flash_attn/__init__.py", line 3, in <module>
from flash_attn.flash_attn_interface import (
File "/XXX/miniconda3/envs/RRC_DocVQA/lib/python3.10/site-packages/flash_attn/flash_attn_interface.py", line 15, in <module>
import flash_attn_2_cuda as flash_attn_gpu
ModuleNotFoundError: No module named 'flash_attn_2_cuda'
Has anyone faced the same issue?
Right, we'd appreciate help with configuring the github runner to use manylinux https://github.com/Dao-AILab/flash-attention/blob/main/.github/workflows/publish.yml
@tridao I am currently conducting compilation testing on the manylinux_2_28 platform and have made substantial progress. Once the compiled artifacts are validated, I will submit a pull request to this project. This approach should provide a fundamental solution to the GLIBC compatibility issues.
@dogeeelin Currently, the released wheel is built on Ubuntu 22.04, which has a higher glibc version. Until the build environment issue is solved, you can build the wheel from source or try another released wheel. If you use torch>=2.7, we have to compile it ourselves.
Thank you, this worked when I downgraded to flash-attn 2.7.4.post1 compiled with torch <= 2.6 and Ubuntu 20.04: https://github.com/Dao-AILab/flash-attention/releases/tag/v2.7.4.post1
pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu126
pip install flash-attn==2.7.4.post1 --no-build-isolation
Solved!
@zipzou That's great news! Looking forward to your fix.