
unable to import fbgemm_gpu

Open · vkuzo opened this issue 4 months ago • 10 comments

Hi folks! When I install fbgemm-gpu-genai from pip, I am unable to import the library. Neither the stable nor the nightly version works for me. Repro:

(pytorch) [[email protected] ~/local/ao (20250821_float8_tensor_fix)]$ with-proxy pip install --pre fbgemm-gpu-genai --index-url https://download.pytorch.org/whl/nightly/cu128/
Looking in indexes: https://download.pytorch.org/whl/nightly/cu128/
Collecting fbgemm-gpu-genai
  Using cached https://download.pytorch.org/whl/nightly/cu128/fbgemm_gpu_genai-2025.8.20%2Bcu128-cp311-cp311-manylinux_2_28_x86_64.whl.metadata (2.7 kB)
Requirement already satisfied: numpy in /home/vasiliy/.conda/envs/pytorch/lib/python3.11/site-packages (from fbgemm-gpu-genai) (2.2.3)
Using cached https://download.pytorch.org/whl/nightly/cu128/fbgemm_gpu_genai-2025.8.20%2Bcu128-cp311-cp311-manylinux_2_28_x86_64.whl (32.2 MB)
Installing collected packages: fbgemm-gpu-genai
Successfully installed fbgemm-gpu-genai-2025.8.20+cu128
(pytorch) [[email protected] ~/local/ao (20250821_float8_tensor_fix)]$ python
Python 3.11.0 (main, Mar  1 2023, 18:26:19) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import fbgemm_gpu
ERROR:root:Could not load the library 'experimental/gen_ai/fbgemm_gpu_experimental_gen_ai.so'!


Could not load this library: /home/vasiliy/.conda/envs/pytorch/lib/python3.11/site-packages/fbgemm_gpu/experimental/gen_ai/fbgemm_gpu_experimental_gen_ai.so



Traceback (most recent call last):
  File "/data/users/vasiliy/pytorch/torch/_ops.py", line 1487, in load_library
    ctypes.CDLL(path)
  File "/home/vasiliy/.conda/envs/pytorch/lib/python3.11/ctypes/__init__.py", line 376, in __init__
    self._handle = _dlopen(self._name, mode)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: /home/vasiliy/.conda/envs/pytorch/lib/python3.11/site-packages/fbgemm_gpu/experimental/gen_ai/fbgemm_gpu_experimental_gen_ai.so: undefined symbol: _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE15_M_replace_coldEPcmPKcmm

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/vasiliy/.conda/envs/pytorch/lib/python3.11/site-packages/fbgemm_gpu/__init__.py", line 90, in <module>
    _load_library(f"{library}.so", __variant__ == "docs")
  File "/home/vasiliy/.conda/envs/pytorch/lib/python3.11/site-packages/fbgemm_gpu/__init__.py", line 22, in _load_library
    raise error
  File "/home/vasiliy/.conda/envs/pytorch/lib/python3.11/site-packages/fbgemm_gpu/__init__.py", line 17, in _load_library
    torch.ops.load_library(os.path.join(os.path.dirname(__file__), filename))
  File "/data/users/vasiliy/pytorch/torch/_ops.py", line 1489, in load_library
    raise OSError(f"Could not load this library: {path}") from e
OSError: Could not load this library: /home/vasiliy/.conda/envs/pytorch/lib/python3.11/site-packages/fbgemm_gpu/experimental/gen_ai/fbgemm_gpu_experimental_gen_ai.so
>>> 

I am on an H100 with PyTorch built from source.

vkuzo avatar Aug 21 '25 11:08 vkuzo

I also repro on the same machine with PyTorch version '2.7.1+cu128'.

vkuzo avatar Aug 21 '25 11:08 vkuzo

The missing symbol _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE15_M_replace_coldEPcmPKcmm appears to indicate that the libstdc++ (GLIBCXX) installed on the system is too old. Could you try this on a more recent OS version, i.e. one with glibc >= 2.28 installed?
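
As a quick diagnostic, a minimal sketch along these lines (plain ctypes, nothing fbgemm-specific) can check whether the libstdc++ your Python process actually loads exports that symbol:

import ctypes

# Load whichever libstdc++ the dynamic loader resolves for this process;
# the fbgemm_gpu extension will be resolved against the same library.
libstdcxx = ctypes.CDLL("libstdc++.so.6")
symbol = "_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE15_M_replace_coldEPcmPKcmm"
try:
    getattr(libstdcxx, symbol)
    print("symbol found: libstdc++ looks new enough")
except AttributeError:
    print("symbol missing: this libstdc++ predates the one the wheel needs")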

q10 avatar Aug 21 '25 21:08 q10

I have glibc 2.34:

(pytorch) [[email protected] ~/local/ao (20250821_float8_tensor_fix)]$ getconf GNU_LIBC_VERSION
glibc 2.34

vkuzo avatar Aug 22 '25 11:08 vkuzo

here is some more info about how I compiled PyTorch:

(pytorch) [[email protected] ~/local/pytorch (main)]$ gcc --version
gcc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-9)
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

(pytorch) [[email protected] ~/local/pytorch (main)]$ python -c "import torch; print(torch.__config__.show())" | grep -i cxx
  - Build settings: BUILD_TYPE=Release, COMMIT_SHA=49ff884b1edc3b872eeb2387ec60ef230cae7f24, CUDA_VERSION=12.6, CUDNN_VERSION=9.8.0, CXX_COMPILER=/usr/lib64/ccache/c++, CXX_FLAGS= -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DLIBKINETO_NOXPUPTI=ON -DUSE_FBGEMM -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -DC10_NODEPRECATED -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-unknown-pragmas -Wno-unused-parameter -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=old-style-cast -faligned-new -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, TORCH_VERSION=2.9.0, USE_CUDA=1, USE_CUDNN=ON, USE_CUSPARSELT=OFF, USE_EIGEN_FOR_BLAS=ON, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=OFF, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF, USE_XCCL=OFF, USE_XPU=OFF, 
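
Since the undefined symbol above lives in the __cxx11 namespace, it may also be worth confirming the C++ ABI flag of the local torch build; this uses a standard torch helper, shown purely as a diagnostic sketch:

import torch

# True means this torch build uses the new (CXX11) libstdc++ string/list ABI,
# which the wheel here appears to expect (the undefined symbol above is a __cxx11 one).
print(torch.compiled_with_cxx11_abi())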

vkuzo avatar Aug 22 '25 12:08 vkuzo

we investigated this a bit more in preparation for the torchao v0.13.0 release, and here is what we observe:

  • PyTorch 2.7.1 and 2.8 + fbgemm_gpu stable -> import fbgemm_gpu works
  • recent PyTorch nightly + fbgemm_gpu nightly -> import fbgemm_gpu works
  • PyTorch 2.7.1 and 2.8 + fbgemm_gpu nightly -> import fbgemm_gpu leads to Aborted (core dumped)
  • recent PyTorch nightly + fbgemm_gpu stable -> import fbgemm_gpu leads to segmentation fault

is this a KP? Is there any issue we can follow to understand more?

vkuzo avatar Aug 27 '25 14:08 vkuzo

Yes, this observation is expected and in line with how the nightlies and stable releases are intended to work (a rough version-check sketch follows the list below):

  • PyTorch 2.7.1 and 2.8 + fbgemm_gpu stable -> This works because fbgemm_gpu stable is built against torch stable, i.e. fbgemm 1.3.0 is built against torch 2.8.

  • Recent PyTorch nightly + fbgemm_gpu nightly -> This also works because the latest fbgemm nightly is built against the latest torch nightly, i.e. fbgemm_gpu nightly.2025.08.25 works with torch nightly.2025.08.25. It also means that an old fbgemm_gpu nightly will likely not work with a newer torch nightly, and vice versa.

  • PyTorch 2.7.1 and 2.8 + fbgemm_gpu nightly -> This fails as expected, because the nightly depends on the ABI from torch nightly, which is ahead of the torch stable releases.

  • Recent PyTorch nightly + fbgemm_gpu stable -> Likewise, fbgemm_gpu 1.3.0 stable relies on the ABI from torch 2.8 stable, whereas the nightly is ahead.
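
A rough pre-import sanity check along these lines (my own sketch, assuming the version conventions above, not an fbgemm_gpu API) can warn when the installed builds are likely mismatched:

import importlib.metadata as md
import torch

torch_version = torch.__version__                 # e.g. "2.8.0+cu128" or "2.9.0.dev20250825+cu128"
fbgemm_version = md.version("fbgemm-gpu-genai")   # e.g. "1.3.0" or "2025.8.20+cu128"

# Nightly fbgemm wheels use date-style versions (YYYY.M.D); torch nightlies carry ".dev".
fbgemm_is_nightly = fbgemm_version.split(".")[0].startswith("20")
torch_is_nightly = "dev" in torch_version

print(f"torch={torch_version}, fbgemm-gpu-genai={fbgemm_version}")
if fbgemm_is_nightly != torch_is_nightly:
    print("warning: mixing nightly and stable builds of torch and fbgemm_gpu "
          "is expected to fail at import time due to ABI differences")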

q10 avatar Aug 27 '25 17:08 q10

According to https://docs.pytorch.org/FBGEMM/general/Releases.html, I use PyTorch 2.9.1, Python 3.13.7, CUDA 13.0.2, and fbgemm-gpu-genai 1.4.1; however, the import still fails:

In [1]:     from fbgemm_gpu.experimental.gen_ai.quantize import int4_row_quantize_zp, pack_int4
[11/14/25 20:53:42] ERROR    Could not load the library 'experimental/gen_ai/fbgemm_gpu_experimental_gen_ai.so'!                                                                     __init__.py:104


                             Could not load this library: /home/wzy/.local/lib/python3.13/site-packages/fbgemm_gpu/experimental/gen_ai/fbgemm_gpu_experimental_gen_ai.so

A more detailed repro:

In [4]:  torch.ops.load_library('/home/wzy/.local/lib/python3.13/site-packages/fbgemm_gpu/experimental/gen_ai/fbgemm_gpu_experimental_gen_ai.so')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.13/site-packages/torch/_ops.py", line 1490, in load_library
    raise OSError(f"Could not load this library: {path}") from e
OSError: Could not load this library: /home/wzy/.local/lib/python3.13/site-packages/fbgemm_gpu/experimental/gen_ai/fbgemm_gpu_experimental_gen_ai.so

Freed-Wu avatar Nov 14 '25 13:11 Freed-Wu

What is the directory you are currently in? If you're in the project root directory, it may cause Python import confusion.
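
One quick way to see which copy of the package Python would pick up (standard importlib machinery, just as a check):

import importlib.util

# If this points into the FBGEMM source tree rather than site-packages,
# the in-tree package is shadowing the installed wheel.
spec = importlib.util.find_spec("fbgemm_gpu")
print(spec.origin if spec else "fbgemm_gpu not found on sys.path")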

q10 avatar Nov 15 '25 06:11 q10

What is the directory you are currently in?

/dev/shm, not the project directory.

Freed-Wu avatar Nov 15 '25 14:11 Freed-Wu

Could you paste the full log of the error? The underlying error should have been printed out as well.
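
If the surrounding log lines are hard to recover, a minimal ctypes sketch like the following (using the path from your report) should surface the loader's underlying error directly:

import ctypes

import torch  # load torch first so its shared libraries are already in the process

# dlopen the extension directly so the loader's own message
# (e.g. "undefined symbol: ...") is printed, rather than only the
# generic OSError that torch.ops.load_library re-raises.
so_path = "/home/wzy/.local/lib/python3.13/site-packages/fbgemm_gpu/experimental/gen_ai/fbgemm_gpu_experimental_gen_ai.so"
try:
    ctypes.CDLL(so_path)
    print("loaded OK")
except OSError as e:
    print("dlopen failed:", e)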

q10 avatar Nov 15 '25 23:11 q10