ImportError `undefined symbol` of `fused_layer_norm_cuda`
## Describe the Bug

I've followed the installation instructions in the README, but an `ImportError` occurs when I import `fused_layer_norm_cuda`. I suspect the problem is a version conflict between CUDA, PyTorch, and GCC; however, I can't find any documented version requirements. 😵💫

## Minimal Steps/Code to Reproduce the Bug
```shell
gcc --version
python -c "import torch; print(torch.__version__); print(torch.version.cuda); import fused_layer_norm_cuda"
```
Output:

```
gcc (GCC) 9.3.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

1.11.0+cu113
11.3
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: /mnt/lustre/sjtu/home/zcz72/anaconda3/envs/OFA3.9New/lib/python3.9/site-packages/fused_layer_norm_cuda.cpython-39-x86_64-linux-gnu.so: undefined symbol: _ZNSt18basic_stringstreamIcSt11char_traitsIcESaIcEEC1Ev
```
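For reference, the missing symbol demangles (via `c++filt`) to the default constructor of `std::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >`, which libstdc++ only began exporting with GCC 9 (symbol version `GLIBCXX_3.4.26`). A minimal diagnostic sketch (my own, not from the README) to check whether the `libstdc++.so.6` the process actually loads exports it:

```python
import ctypes

# The symbol from the ImportError above; present only in libstdc++ from GCC 9+.
MISSING_SYMBOL = "_ZNSt18basic_stringstreamIcSt11char_traitsIcESaIcEEC1Ev"

def has_symbol(libname: str, symbol: str) -> bool:
    """Return True if the shared library `libname` exports `symbol`."""
    try:
        lib = ctypes.CDLL(libname)
    except OSError:
        # Library itself could not be loaded.
        return False
    # ctypes raises AttributeError when looking up an undefined symbol,
    # so hasattr() doubles as a symbol-presence check.
    return hasattr(lib, symbol)

if __name__ == "__main__":
    print("libstdc++ exports the symbol:", has_symbol("libstdc++.so.6", MISSING_SYMBOL))
```

If this prints `False` inside the conda environment but `True` against the system libstdc++, the environment is loading a libstdc++ that is too old for an extension compiled with GCC 9.3.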
## Expected Behavior

`import fused_layer_norm_cuda` should succeed without raising an `ImportError`.

## Environment
```
PyTorch version: 1.11.0+cu113
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A
OS: CentOS Linux 7 (Core) (x86_64)
GCC version: (GCC) 9.3.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.17
Python version: 3.9.13 | packaged by conda-forge | (main, May 27 2022, 16:56:21) [GCC 10.3.0] (64-bit runtime)
Python platform: Linux-3.10.0-693.el7.x86_64-x86_64-with-glibc2.17
Is CUDA available: True
CUDA runtime version: 11.3.58
GPU models and configuration: GPU 0: Tesla V100-PCIE-32GB
Nvidia driver version: 460.73.01
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] numpy==1.23.1
[pip3] pytorch-lightning==1.0.8
[pip3] torch==1.11.0+cu113
[pip3] torchmetrics==0.9.3
[pip3] torchvision==0.12.0+cu113
[conda] numpy              1.23.1  py39hba7629e_0  https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
[conda] pytorch-lightning  1.0.8   pypi_0          pypi
[conda] torch              1.11.0+cu113  pypi_0    pypi
[conda] torchmetrics       0.9.3   pypi_0          pypi
[conda] torchvision        0.12.0+cu113  pypi_0    pypi
```
---

I'm having this issue too.

---

Has anyone found a solution/workaround to this bug?
---

@XiaohanZhangCMU I got the bug when I was using a Singularity (now Apptainer) container with CUDA 11.7 and PyTorch compiled for CUDA 11.7, while my university's cluster has CUDA 11.1. I was able to fix the issue by building a container with CUDA 11.1 instead. That's obviously not a great solution.
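The mismatch in that comment boils down to a simple rule: an extension (or container) built against a newer CUDA than the host provides will fail at load time. A toy sketch of that check — `versions_compatible` is a hypothetical helper for illustration, not a PyTorch or Apex API:

```python
def cuda_major_minor(version: str) -> tuple:
    """Parse a CUDA version string like '11.7' into a comparable (major, minor) pair."""
    major, minor, *_ = version.split(".")
    return (int(major), int(minor))

def versions_compatible(build_cuda: str, host_cuda: str) -> bool:
    """Hypothetical check: the build must not target a newer CUDA than the host has."""
    return cuda_major_minor(build_cuda) <= cuda_major_minor(host_cuda)

print(versions_compatible("11.7", "11.1"))  # the broken container/host pairing -> False
print(versions_compatible("11.1", "11.1"))  # the rebuilt container -> True
```

In practice the build-side version comes from `torch.version.cuda` (shown in the repro above) and the host-side version from the cluster's CUDA toolkit.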