ERROR: Failed building wheel for transformer-engine
I am trying to install TransformerEngine with the following command:
pip install git+https://github.com/NVIDIA/TransformerEngine.git@stable
and I am getting the following error:
RuntimeError: Error when running CMake: Command '['/tmp/pip-req-build-wpw9pxi1/.eggs/cmake-3.28.3-py3.11-linux-x86_64.egg/cmake/data/bin/cmake', '-S', '/tmp/pip-req-build-wpw9pxi1/transformer_engine', '-B', '/tmp/tmps_krasnv', '-DCMAKE_BUILD_TYPE=Release', '-DCMAKE_INSTALL_PREFIX=/tmp/pip-req-build-wpw9pxi1/build/lib.linux-x86_64-cpython-311', '-Dpybind11_DIR=/home/shabs/anaconda3/envs/NeMo/lib/python3.11/site-packages/pybind11/share/cmake/pybind11']' returned non-zero exit status 1.
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for transformer-engine
Running setup.py clean for transformer-engine
Failed to build transformer-engine
ERROR: Could not build wheels for transformer-engine, which is required to install pyproject.toml-based projects
It looks like there's a compilation error when building the core C++ library. Can you provide more of the error message so we can figure out where the error is coming from? I wonder if it's the same as https://github.com/NVIDIA/TransformerEngine/issues/694.
`Collecting git+https://github.com/NVIDIA/TransformerEngine.git@stable Cloning https://github.com/NVIDIA/TransformerEngine.git (to revision stable) to /tmp/pip-req-build-fgxtbhtl Running command git clone --filter=blob:none --quiet https://github.com/NVIDIA/TransformerEngine.git /tmp/pip-req-build-fgxtbhtl Running command git checkout -b stable --track origin/stable Switched to a new branch 'stable' Branch 'stable' set up to track remote branch 'stable' from 'origin'. Resolved https://github.com/NVIDIA/TransformerEngine.git to commit 5b90b7f5ed67b373bc5f843d1ac3b7a8999df08e Running command git submodule update --init --recursive -q Preparing metadata (setup.py) ... done Requirement already satisfied: pydantic in ./anaconda3/envs/NeMo/lib/python3.11/site-packages (from transformer-engine==1.3.0+5b90b7f) (2.6.3) Requirement already satisfied: torch in ./anaconda3/envs/NeMo/lib/python3.11/site-packages (from transformer-engine==1.3.0+5b90b7f) (2.2.1) Collecting flash-attn!=2.0.9,!=2.1.0,<=2.4.2,>=2.0.6 (from transformer-engine==1.3.0+5b90b7f) Using cached flash_attn-2.4.2-cp311-cp311-linux_x86_64.whl Requirement already satisfied: einops in ./anaconda3/envs/NeMo/lib/python3.11/site-packages (from flash-attn!=2.0.9,!=2.1.0,<=2.4.2,>=2.0.6->transformer-engine==1.3.0+5b90b7f) (0.7.0) Requirement already satisfied: packaging in ./anaconda3/envs/NeMo/lib/python3.11/site-packages (from flash-attn!=2.0.9,!=2.1.0,<=2.4.2,>=2.0.6->transformer-engine==1.3.0+5b90b7f) (23.2) Collecting ninja (from flash-attn!=2.0.9,!=2.1.0,<=2.4.2,>=2.0.6->transformer-engine==1.3.0+5b90b7f) Using cached ninja-1.11.1.1-py2.py3-none-manylinux1_x86_64.manylinux_2_5_x86_64.whl.metadata (5.3 kB) Requirement already satisfied: annotated-types>=0.4.0 in ./anaconda3/envs/NeMo/lib/python3.11/site-packages (from pydantic->transformer-engine==1.3.0+5b90b7f) (0.6.0) Requirement already satisfied: pydantic-core==2.16.3 in ./anaconda3/envs/NeMo/lib/python3.11/site-packages (from 
pydantic->transformer-engine==1.3.0+5b90b7f) (2.16.3) Requirement already satisfied: typing-extensions>=4.6.1 in ./anaconda3/envs/NeMo/lib/python3.11/site-packages (from pydantic->transformer-engine==1.3.0+5b90b7f) (4.10.0) Requirement already satisfied: filelock in ./anaconda3/envs/NeMo/lib/python3.11/site-packages (from torch->transformer-engine==1.3.0+5b90b7f) (3.13.1) Requirement already satisfied: sympy in ./anaconda3/envs/NeMo/lib/python3.11/site-packages (from torch->transformer-engine==1.3.0+5b90b7f) (1.12) Requirement already satisfied: networkx in ./anaconda3/envs/NeMo/lib/python3.11/site-packages (from torch->transformer-engine==1.3.0+5b90b7f) (3.2.1) Requirement already satisfied: jinja2 in ./anaconda3/envs/NeMo/lib/python3.11/site-packages (from torch->transformer-engine==1.3.0+5b90b7f) (3.1.3) Requirement already satisfied: fsspec in ./anaconda3/envs/NeMo/lib/python3.11/site-packages (from torch->transformer-engine==1.3.0+5b90b7f) (2024.2.0) Requirement already satisfied: nvidia-cuda-nvrtc-cu12==12.1.105 in ./anaconda3/envs/NeMo/lib/python3.11/site-packages (from torch->transformer-engine==1.3.0+5b90b7f) (12.1.105) Requirement already satisfied: nvidia-cuda-runtime-cu12==12.1.105 in ./anaconda3/envs/NeMo/lib/python3.11/site-packages (from torch->transformer-engine==1.3.0+5b90b7f) (12.1.105) Requirement already satisfied: nvidia-cuda-cupti-cu12==12.1.105 in ./anaconda3/envs/NeMo/lib/python3.11/site-packages (from torch->transformer-engine==1.3.0+5b90b7f) (12.1.105) Requirement already satisfied: nvidia-cudnn-cu12==8.9.2.26 in ./anaconda3/envs/NeMo/lib/python3.11/site-packages (from torch->transformer-engine==1.3.0+5b90b7f) (8.9.2.26) Requirement already satisfied: nvidia-cublas-cu12==12.1.3.1 in ./anaconda3/envs/NeMo/lib/python3.11/site-packages (from torch->transformer-engine==1.3.0+5b90b7f) (12.1.3.1) Requirement already satisfied: nvidia-cufft-cu12==11.0.2.54 in ./anaconda3/envs/NeMo/lib/python3.11/site-packages (from 
torch->transformer-engine==1.3.0+5b90b7f) (11.0.2.54) Requirement already satisfied: nvidia-curand-cu12==10.3.2.106 in ./anaconda3/envs/NeMo/lib/python3.11/site-packages (from torch->transformer-engine==1.3.0+5b90b7f) (10.3.2.106) Requirement already satisfied: nvidia-cusolver-cu12==11.4.5.107 in ./anaconda3/envs/NeMo/lib/python3.11/site-packages (from torch->transformer-engine==1.3.0+5b90b7f) (11.4.5.107) Requirement already satisfied: nvidia-cusparse-cu12==12.1.0.106 in ./anaconda3/envs/NeMo/lib/python3.11/site-packages (from torch->transformer-engine==1.3.0+5b90b7f) (12.1.0.106) Requirement already satisfied: nvidia-nccl-cu12==2.19.3 in ./anaconda3/envs/NeMo/lib/python3.11/site-packages (from torch->transformer-engine==1.3.0+5b90b7f) (2.19.3) Requirement already satisfied: nvidia-nvtx-cu12==12.1.105 in ./anaconda3/envs/NeMo/lib/python3.11/site-packages (from torch->transformer-engine==1.3.0+5b90b7f) (12.1.105) Requirement already satisfied: triton==2.2.0 in ./anaconda3/envs/NeMo/lib/python3.11/site-packages (from torch->transformer-engine==1.3.0+5b90b7f) (2.2.0) Requirement already satisfied: nvidia-nvjitlink-cu12 in ./anaconda3/envs/NeMo/lib/python3.11/site-packages (from nvidia-cusolver-cu12==11.4.5.107->torch->transformer-engine==1.3.0+5b90b7f) (12.3.101) Requirement already satisfied: MarkupSafe>=2.0 in ./anaconda3/envs/NeMo/lib/python3.11/site-packages (from jinja2->torch->transformer-engine==1.3.0+5b90b7f) (2.1.5) Requirement already satisfied: mpmath>=0.19 in ./anaconda3/envs/NeMo/lib/python3.11/site-packages (from sympy->torch->transformer-engine==1.3.0+5b90b7f) (1.3.0) Using cached ninja-1.11.1.1-py2.py3-none-manylinux1_x86_64.manylinux_2_5_x86_64.whl (307 kB) Building wheels for collected packages: transformer-engine Building wheel for transformer-engine (setup.py) ... error error: subprocess-exited-with-error
× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [163 lines of output]
/home/shabs/anaconda3/envs/NeMo/lib/python3.11/site-packages/setuptools/__init__.py:80: _DeprecatedInstaller: setuptools.installer and fetch_build_eggs are deprecated.
!!
********************************************************************************
Requirements should be satisfied by a PEP 517 installer.
If you are using pip, you can try `pip install --use-pep517`.
********************************************************************************
!!
dist.fetch_build_eggs(dist.setup_requires)
running bdist_wheel
/home/shabs/anaconda3/envs/NeMo/lib/python3.11/site-packages/torch/utils/cpp_extension.py:500: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
warnings.warn(msg.format('we could not find ninja.'))
running build
running build_py
creating build
creating build/lib.linux-x86_64-cpython-311
creating build/lib.linux-x86_64-cpython-311/transformer_engine
copying transformer_engine/__init__.py -> build/lib.linux-x86_64-cpython-311/transformer_engine
creating build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch
copying transformer_engine/pytorch/float8_tensor.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch
copying transformer_engine/pytorch/utils.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch
copying transformer_engine/pytorch/constants.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch
copying transformer_engine/pytorch/attention.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch
copying transformer_engine/pytorch/__init__.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch
copying transformer_engine/pytorch/transformer.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch
copying transformer_engine/pytorch/numerics_debug.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch
copying transformer_engine/pytorch/jit.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch
copying transformer_engine/pytorch/te_onnx_extensions.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch
copying transformer_engine/pytorch/distributed.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch
copying transformer_engine/pytorch/softmax.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch
copying transformer_engine/pytorch/export.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch
copying transformer_engine/pytorch/cpu_offload.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch
copying transformer_engine/pytorch/fp8.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch
creating build/lib.linux-x86_64-cpython-311/transformer_engine/common
copying transformer_engine/common/utils.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/common
copying transformer_engine/common/__init__.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/common
copying transformer_engine/common/recipe.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/common
creating build/lib.linux-x86_64-cpython-311/transformer_engine/paddle
copying transformer_engine/paddle/utils.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/paddle
copying transformer_engine/paddle/constants.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/paddle
copying transformer_engine/paddle/__init__.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/paddle
copying transformer_engine/paddle/recompute.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/paddle
copying transformer_engine/paddle/cpp_extensions.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/paddle
copying transformer_engine/paddle/profile.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/paddle
copying transformer_engine/paddle/fp8_buffer.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/paddle
copying transformer_engine/paddle/distributed.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/paddle
copying transformer_engine/paddle/fp8.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/paddle
creating build/lib.linux-x86_64-cpython-311/transformer_engine/jax
copying transformer_engine/jax/sharding.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/jax
copying transformer_engine/jax/__init__.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/jax
copying transformer_engine/jax/layernorm.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/jax
copying transformer_engine/jax/cpp_extensions.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/jax
copying transformer_engine/jax/fused_attn.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/jax
copying transformer_engine/jax/dot.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/jax
copying transformer_engine/jax/mlp.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/jax
copying transformer_engine/jax/softmax.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/jax
copying transformer_engine/jax/fp8.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/jax
creating build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch/module
copying transformer_engine/pytorch/module/layernorm_linear.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch/module
copying transformer_engine/pytorch/module/_common.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch/module
copying transformer_engine/pytorch/module/linear.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch/module
copying transformer_engine/pytorch/module/__init__.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch/module
copying transformer_engine/pytorch/module/layernorm.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch/module
copying transformer_engine/pytorch/module/rmsnorm.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch/module
copying transformer_engine/pytorch/module/layernorm_mlp.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch/module
copying transformer_engine/pytorch/module/base.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch/module
creating build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch/cpp_extensions
copying transformer_engine/pytorch/cpp_extensions/__init__.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch/cpp_extensions
copying transformer_engine/pytorch/cpp_extensions/transpose.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch/cpp_extensions
copying transformer_engine/pytorch/cpp_extensions/normalization.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch/cpp_extensions
copying transformer_engine/pytorch/cpp_extensions/fused_attn.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch/cpp_extensions
copying transformer_engine/pytorch/cpp_extensions/cast.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch/cpp_extensions
copying transformer_engine/pytorch/cpp_extensions/activation.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch/cpp_extensions
copying transformer_engine/pytorch/cpp_extensions/gemm.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/pytorch/cpp_extensions
creating build/lib.linux-x86_64-cpython-311/transformer_engine/paddle/layer
copying transformer_engine/paddle/layer/attention.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/paddle/layer
copying transformer_engine/paddle/layer/layernorm_linear.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/paddle/layer
copying transformer_engine/paddle/layer/linear.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/paddle/layer
copying transformer_engine/paddle/layer/__init__.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/paddle/layer
copying transformer_engine/paddle/layer/layernorm.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/paddle/layer
copying transformer_engine/paddle/layer/transformer.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/paddle/layer
copying transformer_engine/paddle/layer/softmax.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/paddle/layer
copying transformer_engine/paddle/layer/layernorm_mlp.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/paddle/layer
copying transformer_engine/paddle/layer/base.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/paddle/layer
creating build/lib.linux-x86_64-cpython-311/transformer_engine/jax/praxis
copying transformer_engine/jax/praxis/__init__.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/jax/praxis
copying transformer_engine/jax/praxis/module.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/jax/praxis
copying transformer_engine/jax/praxis/transformer.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/jax/praxis
creating build/lib.linux-x86_64-cpython-311/transformer_engine/jax/flax
copying transformer_engine/jax/flax/__init__.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/jax/flax
copying transformer_engine/jax/flax/module.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/jax/flax
copying transformer_engine/jax/flax/transformer.py -> build/lib.linux-x86_64-cpython-311/transformer_engine/jax/flax
running build_ext
Building CMake extension transformer_engine
Running command /tmp/pip-req-build-fgxtbhtl/.eggs/cmake-3.28.3-py3.11-linux-x86_64.egg/cmake/data/bin/cmake -S /tmp/pip-req-build-fgxtbhtl/transformer_engine -B /tmp/tmpfzxgbal5 -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/tmp/pip-req-build-fgxtbhtl/build/lib.linux-x86_64-cpython-311 -Dpybind11_DIR=/home/shabs/anaconda3/envs/NeMo/lib/python3.11/site-packages/pybind11/share/cmake/pybind11
-- The CUDA compiler identification is unknown
-- The CXX compiler identification is GNU 11.4.0
CMake Error at CMakeLists.txt:15 (project):
No CMAKE_CUDA_COMPILER could be found.
Tell CMake where to find the compiler by setting either the environment
variable "CUDACXX" or the CMake cache entry CMAKE_CUDA_COMPILER to the full
path to the compiler, or to the compiler name if it is in the PATH.
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring incomplete, errors occurred!
Traceback (most recent call last):
File "/tmp/pip-req-build-fgxtbhtl/setup.py", line 353, in _build_cmake
subprocess.run(command, cwd=build_dir, check=True)
File "/home/shabs/anaconda3/envs/NeMo/lib/python3.11/subprocess.py", line 569, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/tmp/pip-req-build-fgxtbhtl/.eggs/cmake-3.28.3-py3.11-linux-x86_64.egg/cmake/data/bin/cmake', '-S', '/tmp/pip-req-build-fgxtbhtl/transformer_engine', '-B', '/tmp/tmpfzxgbal5', '-DCMAKE_BUILD_TYPE=Release', '-DCMAKE_INSTALL_PREFIX=/tmp/pip-req-build-fgxtbhtl/build/lib.linux-x86_64-cpython-311', '-Dpybind11_DIR=/home/shabs/anaconda3/envs/NeMo/lib/python3.11/site-packages/pybind11/share/cmake/pybind11']' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/tmp/pip-req-build-fgxtbhtl/setup.py", line 626, in <module>
main()
File "/tmp/pip-req-build-fgxtbhtl/setup.py", line 611, in main
setuptools.setup(
File "/home/shabs/anaconda3/envs/NeMo/lib/python3.11/site-packages/setuptools/__init__.py", line 103, in setup
return distutils.core.setup(**attrs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/shabs/anaconda3/envs/NeMo/lib/python3.11/site-packages/setuptools/_distutils/core.py", line 185, in setup
return run_commands(dist)
^^^^^^^^^^^^^^^^^^
File "/home/shabs/anaconda3/envs/NeMo/lib/python3.11/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
dist.run_commands()
File "/home/shabs/anaconda3/envs/NeMo/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
self.run_command(cmd)
File "/home/shabs/anaconda3/envs/NeMo/lib/python3.11/site-packages/setuptools/dist.py", line 989, in run_command
super().run_command(command)
File "/home/shabs/anaconda3/envs/NeMo/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/home/shabs/anaconda3/envs/NeMo/lib/python3.11/site-packages/wheel/bdist_wheel.py", line 364, in run
self.run_command("build")
File "/home/shabs/anaconda3/envs/NeMo/lib/python3.11/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
self.distribution.run_command(command)
File "/home/shabs/anaconda3/envs/NeMo/lib/python3.11/site-packages/setuptools/dist.py", line 989, in run_command
super().run_command(command)
File "/home/shabs/anaconda3/envs/NeMo/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/home/shabs/anaconda3/envs/NeMo/lib/python3.11/site-packages/setuptools/_distutils/command/build.py", line 131, in run
self.run_command(cmd_name)
File "/home/shabs/anaconda3/envs/NeMo/lib/python3.11/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
self.distribution.run_command(command)
File "/home/shabs/anaconda3/envs/NeMo/lib/python3.11/site-packages/setuptools/dist.py", line 989, in run_command
super().run_command(command)
File "/home/shabs/anaconda3/envs/NeMo/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/tmp/pip-req-build-fgxtbhtl/setup.py", line 383, in run
ext._build_cmake(
File "/tmp/pip-req-build-fgxtbhtl/setup.py", line 355, in _build_cmake
raise RuntimeError(f"Error when running CMake: {e}")
RuntimeError: Error when running CMake: Command '['/tmp/pip-req-build-fgxtbhtl/.eggs/cmake-3.28.3-py3.11-linux-x86_64.egg/cmake/data/bin/cmake', '-S', '/tmp/pip-req-build-fgxtbhtl/transformer_engine', '-B', '/tmp/tmpfzxgbal5', '-DCMAKE_BUILD_TYPE=Release', '-DCMAKE_INSTALL_PREFIX=/tmp/pip-req-build-fgxtbhtl/build/lib.linux-x86_64-cpython-311', '-Dpybind11_DIR=/home/shabs/anaconda3/envs/NeMo/lib/python3.11/site-packages/pybind11/share/cmake/pybind11']' returned non-zero exit status 1.
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip. ERROR: Failed building wheel for transformer-engine Running setup.py clean for transformer-engine Failed to build transformer-engine ERROR: Could not build wheels for transformer-engine, which is required to install pyproject.toml-based projects `
CMake is failing since it can't find your CUDA installation. You can reproduce this outside of TE by making a CMakeLists.txt file:
cmake_minimum_required(VERSION 3.18)
project(myproject LANGUAGES CUDA CXX)
Then call cmake . in the directory.
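If you would rather script that check, here is a minimal Python sketch. It writes the same two-line CMakeLists.txt into a temporary directory and runs the CMake configure step on it. The function names write_probe_project and cuda_visible_to_cmake are made up for this example, and it assumes a cmake new enough to accept -S and -B.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# The same two-line project as in the comment above.
PROBE_CMAKELISTS = (
    "cmake_minimum_required(VERSION 3.18)\n"
    "project(myproject LANGUAGES CUDA CXX)\n"
)


def write_probe_project(directory):
    """Write the minimal CMakeLists.txt into `directory` and return its path."""
    path = Path(directory) / "CMakeLists.txt"
    path.write_text(PROBE_CMAKELISTS)
    return path


def cuda_visible_to_cmake():
    """Return True/False for the configure result, or None if cmake is missing."""
    cmake = shutil.which("cmake")
    if cmake is None:
        return None
    with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as build:
        write_probe_project(src)
        # Configure only; a missing CUDA compiler makes this step fail.
        result = subprocess.run(
            [cmake, "-S", src, "-B", build],
            capture_output=True,
            text=True,
        )
        return result.returncode == 0
```

Calling cuda_visible_to_cmake() in the same shell environment you use for pip gives a quick yes/no before starting the full TransformerEngine build. A False result can have causes other than CUDA (for example a missing C++ compiler), so check the CMake output if in doubt.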
I'd recommend one of the following:
- Set the CUDA_PATH environment variable with the path to the CUDA installation (something like /usr/local/cuda)
- Add nvcc to your PATH
- Set the CUDACXX environment variable with the path to nvcc
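To see which of these three options is already in effect on your machine, something like the following sketch may help. It is only a diagnostic: find_nvcc is a hypothetical helper, the order in which it checks the variables is an arbitrary choice for the example and not a description of CMake's own search logic, and CUDA_HOME is included as an assumption because some build tools read it as well.

```python
import os
import shutil
from pathlib import Path


def find_nvcc(environ=None):
    """Return (source, path) for the first nvcc candidate found, else None."""
    env = os.environ if environ is None else environ
    search_path = env.get("PATH", "")

    # Option 3 in the list: CUDACXX may hold a full path or a bare compiler name.
    cudacxx = env.get("CUDACXX")
    if cudacxx:
        resolved = shutil.which(cudacxx, path=search_path)
        if resolved:
            return ("CUDACXX", resolved)

    # Option 2: nvcc reachable through PATH.
    on_path = shutil.which("nvcc", path=search_path)
    if on_path:
        return ("PATH", on_path)

    # Option 1: CUDA_PATH pointing at the install root (CUDA_HOME is an assumption).
    for var in ("CUDA_PATH", "CUDA_HOME"):
        root = env.get(var)
        if root:
            candidate = Path(root) / "bin" / "nvcc"
            if candidate.is_file() and os.access(candidate, os.X_OK):
                return (var, str(candidate))

    return None


if __name__ == "__main__":
    print(find_nvcc() or "nvcc not found via CUDACXX, PATH, CUDA_PATH or CUDA_HOME")
```

If this prints that nvcc was not found, the CMake error above is expected, and setting any one of the three variables should be the next step.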
Related: https://github.com/NVIDIA/TransformerEngine/issues/383
I solved this issue by running this command in the TransformerEngine directory:
git submodule update --init --recursive
I hope this helps.
I was able to compile with CUDA Toolkit 12.4 and a PyTorch build for CUDA 12.4 on Ubuntu 24.04. I was not able to compile with a PyTorch build for CUDA 12.1 and CUDA Toolkit 12.5. The Docker image uses 12.2 for both, so I assume that works. 12.1 for both might work, but I didn't test it. These compilation errors are usually caused by a version mismatch.
Check your PyTorch CUDA version:
python
import torch
torch.version.cuda
Check your cuda-toolkit version:
nvcc --version
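The two checks above can be combined into one comparison. The sketch below is my own helper code, not part of TransformerEngine: parse_nvcc_release pulls the "release X.Y" number out of the nvcc --version text, and versions_match compares it with torch.version.cuda. Requiring major and minor to be equal is an assumption on my part, chosen to mirror the combinations reported above.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Extract 'major.minor' from `nvcc --version` output, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def versions_match(torch_cuda, toolkit):
    """True only if both versions are known and agree on major.minor."""
    if not torch_cuda or not toolkit:
        return False
    return torch_cuda.split(".")[:2] == toolkit.split(".")[:2]


def check_installed_versions():
    """Compare the local PyTorch CUDA version with the local nvcc release."""
    import subprocess

    import torch  # imported lazily so the helpers above work without PyTorch

    out = subprocess.run(
        ["nvcc", "--version"], capture_output=True, text=True
    ).stdout
    toolkit = parse_nvcc_release(out)
    # torch.version.cuda is None for CPU-only PyTorch builds.
    return torch.version.cuda, toolkit, versions_match(torch.version.cuda, toolkit)
```

Calling check_installed_versions() returns the PyTorch CUDA version, the toolkit version, and whether they agree. It raises FileNotFoundError if nvcc is not on PATH, which is itself a useful hint for the CMake error in this issue.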
You can grab a PyTorch build for CUDA 12.4 from the preview builds here:
https://pytorch.org/get-started/locally/
CUDA Toolkit 12.4 here:
https://developer.nvidia.com/cuda-12-4-0-download-archive?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=22.04&target_type=deb_local
Make sure to set MAX_JOBS to 1 before compiling (known flash attn issue):
export MAX_JOBS=1
Update your ~/.bashrc with these environment variables:
export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
export CUDA_PATH=/usr/local/cuda
export CUDA_HOME=/usr/local/cuda
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
export CUDACXX=/usr/local/cuda/bin/nvcc
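One thing to watch when editing PATH: each entry is treated as a directory to search, so an entry that points at a file (for example the nvcc binary itself) has no effect. A small sketch for spotting such entries, with non_directory_path_entries being a made-up helper name:

```python
import os


def non_directory_path_entries(path_value):
    """Return the PATH entries that are not existing directories."""
    return [
        entry
        for entry in path_value.split(os.pathsep)
        # Empty entries are skipped; on POSIX they mean the current directory.
        if entry and not os.path.isdir(entry)
    ]


if __name__ == "__main__":
    for entry in non_directory_path_entries(os.environ.get("PATH", "")):
        print(f"not a directory: {entry}")
```

Entries it reports are either files or paths that do not exist. Neither breaks anything, but neither helps CMake find nvcc.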
Then run:
pip install git+https://github.com/NVIDIA/TransformerEngine.git@stable
Compilation will take a while. Avoid installing with python setup.py install from a source checkout; install with the git+ URL instead.
I fixed this bug by adding export PATH=/usr/local/cuda/bin:$PATH to my .bashrc.
That cost me one afternoon.
For future reference, https://github.com/NVIDIA/TransformerEngine/issues/700#issuecomment-1979377899 provides instructions on installing CUDA so it is available to CMake.
I'll close this issue so this guidance is the last in the thread and is easier for other users to find. Please open a new issue if you run into another CMake issue.