apex
ModuleNotFoundError: No module named 'fused_layer_norm_cuda' on Ubuntu 22.04, even though apex-0.1 installed successfully
Describe the Bug
ModuleNotFoundError: No module named 'fused_layer_norm_cuda' is raised even though apex was installed with:
pip3 install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--fast_layer_norm" ./
To get apex to compile, I had to edit one file:
apex/contrib/csrc/layer_norm/ln.h -> add #include
Successfully installed apex-0.1
Minimal Steps/Code to Reproduce the Bug
from transformers import T5ForConditionalGeneration
model = T5ForConditionalGeneration.from_pretrained("t5-small")
Expected Behavior
The model loads without raising ModuleNotFoundError: No module named 'fused_layer_norm_cuda'.
Environment
Ubuntu 22.04
CUDA Version: 11.7
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116
apex-0.1
python3 -m torch.utils.collect_env
Collecting environment information...
PyTorch version: 1.12.1+cu116
Is debug build: False
CUDA used to build PyTorch: 11.6
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.1 LTS (x86_64)
GCC version: (Ubuntu 11.2.0-19ubuntu1) 11.2.0
Clang version: Could not collect
CMake version: version 3.22.1
Libc version: glibc-2.35
Python version: 3.10.4 (main, Jun 29 2022, 12:14:53) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.15.0-47-generic-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 3090
Nvidia driver version: 515.65.01
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.5.0
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.5.0
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.5.0
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.5.0
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.5.0
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.5.0
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.5.0
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
Versions of relevant libraries:
[pip3] numpy==1.23.3
[pip3] torch==1.12.1+cu116
[pip3] torchaudio==0.12.1+cu116
[pip3] torchvision==0.13.1+cu116
[conda] Could not collect
Could you try installing apex with the --global-option="--cuda_ext" option as well? fused_layer_norm_cuda is not built by the "--fast_layer_norm" option; it is built by "--cuda_ext".
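A quick way to confirm whether a given build produced the extension is to ask Python whether the module can be found, without importing transformers at all. The following is a minimal sketch; the helper name `missing_modules` is hypothetical, and the only module name taken from this thread is `fused_layer_norm_cuda`.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that Python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Module name taken from the traceback in this issue.
missing = missing_modules(["fused_layer_norm_cuda"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("fused_layer_norm_cuda is importable")
```

If the module is reported as not installed after a rebuild, the build most likely ran without the option that compiles it.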
Thanks, that helped. Previously I always installed without this option and it worked. In the script, you have to comment out the line that complains about the version.
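The version complaint may be related to the Environment section, which reports CUDA 11.7 on the system and a PyTorch build against CUDA 11.6. That link is an assumption, since the thread does not quote the failing line. A minimal sketch of a major.minor comparison of this kind, using the hypothetical helpers `major_minor` and `cuda_versions_match`:

```python
def major_minor(version):
    """Return (major, minor) as ints from a version string such as '11.7'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def cuda_versions_match(torch_cuda, system_cuda):
    """Compare only the major and minor components of two CUDA versions."""
    return major_minor(torch_cuda) == major_minor(system_cuda)


# Values copied from the Environment section of this issue:
# PyTorch was built with CUDA 11.6, the system reports CUDA 11.7.
print(cuda_versions_match("11.6", "11.7"))  # prints False
```

On a real machine the first value would come from `torch.version.cuda`; the strings are hard-coded here so the sketch runs without PyTorch installed.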
Any chance to get this included in master?