DeepSpeed fails to build with DS_BUILD_OPS=1 due to missing nccl.h file

Hi, I'm having trouble installing DeepSpeed with additional build flags. When I run

export NCCL_HOME=$CONDA_PREFIX/lib/python3.11/site-packages/nvidia/nccl
DS_BUILD_OPS=1 DS_BUILD_TRANSFORMER_INFERENCE=1 pip install --force-reinstall deepspeed --no-cache  --no-deps

I get the following error:

 building 'deepspeed.ops.dc_op' extension
      creating build/temp.linux-x86_64-cpython-311/csrc/compile
      /home/conda/envs/petrov_dpo/bin/x86_64-conda-linux-gnu-c++ -fvisibility-inlines-hidden -fmessage-length=0 -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /home/conda/envs/petrov_dpo/include -I/home/conda/envs/petrov_dpo/targets/x86_64-linux/include -L/home/conda/envs/petrov_dpo/targets/x86_64-linux/lib -L/home/conda/envs/petrov_dpo/targets/x86_64-linux/lib/stubs -DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /home/conda/envs/petrov_dpo/include -I/home/conda/envs/petrov_dpo/targets/x86_64-linux/include -L/home/conda/envs/petrov_dpo/targets/x86_64-linux/lib -L/home/conda/envs/petrov_dpo/targets/x86_64-linux/lib/stubs -fPIC -I/tmp/pip-install-xj439mbp/deepspeed_e6e2eefe53f349b98db99e60376d7866/csrc/includes -I/tmp/pip-install-xj439mbp/deepspeed_e6e2eefe53f349b98db99e60376d7866/csrc/compile -I/home/conda/envs/petrov_dpo/include -I/home/conda/envs/petrov_dpo/lib/python3.11/site-packages/torch/include -I/home/conda/envs/petrov_dpo/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/conda/envs/petrov_dpo/include -I/home/conda/envs/petrov_dpo/include/python3.11 -c csrc/compile/deepcompile.cpp -o build/temp.linux-x86_64-cpython-311/csrc/compile/deepcompile.o -O3 -std=c++17 -g -Wno-reorder -L/home/conda/envs/petrov_dpo/lib -lcudart -lcublas -g -march=native -fopenmp -D__AVX512__ -D__ENABLE_CUDA__ -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1016\" -DTORCH_EXTENSION_NAME=dc_op -D_GLIBCXX_USE_CXX11_ABI=1
      In file included from /tmp/pip-install-xj439mbp/deepspeed_e6e2eefe53f349b98db99e60376d7866/csrc/includes/deepcompile.h:20,
                       from csrc/compile/deepcompile.cpp:6:
      /home/conda/envs/petrov_dpo/lib/python3.11/site-packages/torch/include/torch/csrc/distributed/c10d/NCCLUtils.hpp:15:10: fatal error: nccl.h: No such file or directory
         15 | #include <nccl.h>
            |          ^~~~~~~~
      compilation terminated.
      error: command '/home/conda/envs/petrov_dpo/bin/x86_64-conda-linux-gnu-c++' failed with exit code 1

Is there a way to solve this? I'm using cuDNN installed via conda, so CUDA, nvcc, and other libraries live in non-standard folders.
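Before changing any build settings, it may help to confirm where nccl.h actually lives in the environment. The following is a minimal sketch, not part of DeepSpeed; the function name `find_nccl_headers` is hypothetical, and it simply walks a directory tree looking for the header.

```python
from pathlib import Path


def find_nccl_headers(root):
    """Return the sorted list of directories under `root` that contain nccl.h.

    Hypothetical helper for diagnosis only; it is not a DeepSpeed or PyTorch API.
    """
    root = Path(root)
    # rglob searches recursively; a set removes duplicate parent directories.
    return sorted({str(header.parent) for header in root.rglob("nccl.h")})
```

Calling `find_nccl_headers(os.environ["CONDA_PREFIX"])` inside the activated environment should list every directory that holds the header. If the list is empty, the NCCL headers are probably not installed in that environment at all.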

Aug 06 '25 13:08 Sirorezka

Same issue here. Maybe we need a way to specify an additional NCCL path instead of using the CUDA path directly. https://github.com/deepspeedai/DeepSpeed/blob/047a7599d24622dfb37fa5e5a32c671b1bb44233/op_builder/dc.py#L40

For example, the builder could check an NCCL_INCLUDE_PATH environment variable.

Aug 28 '25 06:08 npuichigo

Thanks for the answer. Sorry, I have moved on from this topic, so I'm not sure I'll have time to check your solution, but maybe this question will help someone else. Many thanks.

Sep 09 '25 14:09 Sirorezka

FYI, I made the following patch and applied it to 0.17.6:

git apply <<'PATCH'
diff --git a/op_builder/dc.py b/op_builder/dc.py
index 15b25bf3..bce4e97d 100644
--- a/op_builder/dc.py
+++ b/op_builder/dc.py
@@ -33,6 +33,10 @@ class DeepCompileBuilder(TorchCPUOpBuilder):
             CUDA_INCLUDE = []
         elif not self.is_rocm_pytorch():
             CUDA_INCLUDE = [os.path.join(torch.utils.cpp_extension.CUDA_HOME, "include")]
+            # If set, append a single NCCL include dir.
+            _nccl_inc = os.environ.get("NCCL_INCLUDE_DIR")
+            if _nccl_inc and _nccl_inc not in CUDA_INCLUDE:
+                CUDA_INCLUDE.append(_nccl_inc)
         else:
             CUDA_INCLUDE = [
                 os.path.join(torch.utils.cpp_extension.ROCM_HOME, "include"),
PATCH

Set and export NCCL_INCLUDE_DIR beforehand, of course.
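To make the patched branch easier to reason about, here is a standalone paraphrase of its logic. This is a sketch only: the function name `cuda_include_dirs` and its parameters are hypothetical and do not exist in DeepSpeed. It shows that NCCL_INCLUDE_DIR is appended after the CUDA include directory only when it is set and not already present.

```python
import os


def cuda_include_dirs(cuda_home, environ=None):
    """Mimic the patched non-ROCm branch: CUDA include dir plus an optional NCCL dir.

    Hypothetical standalone function for illustration; the real patch edits
    DeepCompileBuilder in op_builder/dc.py.
    """
    environ = os.environ if environ is None else environ
    include_dirs = [os.path.join(cuda_home, "include")]
    # If set, append a single NCCL include dir (same check as in the patch).
    nccl_inc = environ.get("NCCL_INCLUDE_DIR")
    if nccl_inc and nccl_inc not in include_dirs:
        include_dirs.append(nccl_inc)
    return include_dirs
```

With NCCL_INCLUDE_DIR unset or empty, the result is just the CUDA include directory, so the patch leaves existing builds unchanged. Because the patch is applied with `git apply` to a source checkout, the build presumably has to be run from that patched checkout and not from the package downloaded from PyPI.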

Oct 01 '25 03:10 felker