DeepSpeedExamples

nvcc compile error reduction_utils.h(171) error: no operator "<" matches these operands FAILED: layer_norm.cuda.o

Open WXFMAV opened this issue 2 years ago • 0 comments

Has anyone else run into this problem?

I am using the single_gpu setup with the 1.3B model. Step 1 and step 2 both completed successfully, but step 3 fails when nvcc compiles layer_norm.cu (the layer_norm.cuda.o target).

error: no operator "+" matches these operands

The detailed error is:

FAILED: layer_norm.cuda.o 
/home/myusr/anaconda3/envs/deepspeed/bin/nvcc  -DTORCH_EXTENSION_NAME=transformer_inference -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/myusr/anaconda3/envs/deepspeed/lib/python3.7/site-packages/deepspeed/ops/csrc/transformer/inference/includes -I/home/myusr/anaconda3/envs/deepspeed/lib/python3.7/site-packages/deepspeed/ops/csrc/includes -isystem /home/myusr/anaconda3/envs/deepspeed/lib/python3.7/site-packages/torch/include -isystem /home/myusr/anaconda3/envs/deepspeed/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/myusr/anaconda3/envs/deepspeed/lib/python3.7/site-packages/torch/include/TH -isystem /home/myusr/anaconda3/envs/deepspeed/lib/python3.7/site-packages/torch/include/THC -isystem /home/myusr/anaconda3/envs/deepspeed/include -isystem /home/myusr/anaconda3/envs/deepspeed/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_70,code=compute_70 -gencode=arch=compute_70,code=sm_70 --compiler-options '-fPIC' -std=c++14 -c /home/myusr/anaconda3/envs/deepspeed/lib/python3.7/site-packages/deepspeed/ops/csrc/transformer/inference/csrc/layer_norm.cu -o layer_norm.cuda.o 
/home/myusr/anaconda3/envs/deepspeed/lib/python3.7/site-packages/deepspeed/ops/csrc/includes/reduction_utils.h(171): error: no operator "+" matches these operands
            operand types are: const __half + const __half
      return lhs + rhs;
                 ^
...

8 errors detected in the compilation of "/home/myusr/anaconda3/envs/deepspeed/lib/python3.7/site-packages/deepspeed/ops/csrc/transformer/inference/csrc/layer_norm.cu".

...

The above exception was the direct cause of the following exception:
....
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'transformer_inference'
...
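One detail in the log that may be worth checking: the nvcc command line defines `__CUDA_NO_HALF_OPERATORS__`, and CUDA's `cuda_fp16` header leaves out the arithmetic operator overloads for `__half` when that macro is defined, which would be consistent with `const __half + const __half` finding no matching operator. Whether that is the actual cause here is an assumption, not something the log confirms. The sketch below only lists the `-D` macros found in a compiler command line; `defined_macros` is a hypothetical helper name, and the command string is an abbreviated copy of the invocation above with the include paths left out.

```python
import shlex


def defined_macros(command: str) -> dict:
    """Return {macro: value or None} for every -D flag in a compiler command line."""
    macros = {}
    for token in shlex.split(command):
        if token.startswith("-D") and len(token) > 2:
            name, _, value = token[2:].partition("=")
            macros[name] = value if value else None
    return macros


# Abbreviated copy of the nvcc invocation from the log (include paths omitted).
command = (
    "nvcc -DTORCH_EXTENSION_NAME=transformer_inference "
    "-D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ "
    "-D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ "
    "-D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -std=c++14 "
    "-c layer_norm.cu -o layer_norm.cuda.o"
)

macros = defined_macros(command)

# Macros that switch off parts of the half/bfloat16 headers.
half_flags = sorted(name for name in macros if name.startswith("__CUDA_NO_"))
for name in half_flags:
    print(name)
```

To check a full build log, the same helper can be applied to the complete command line copied from the `FAILED:` entry.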

Environment:

nvcc:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0

gcc:

gcc (GCC) 5.4.0
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

nvidia-smi:

Sun Apr 23 18:05:58 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000000:4F:00.0 Off |                    0 |
| N/A   37C    P0    73W / 300W |   1277MiB / 32510MiB |      6%      Default |
|                               |                      |                  N/A |
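The two outputs above report different CUDA versions: nvcc is from the 12.1 toolkit, while the "CUDA Version" field in nvidia-smi (the highest CUDA version the installed driver supports) shows 11.0. A mismatch like this usually matters when running kernels rather than when compiling them, so it may be unrelated to the compile error, but it is easy to check. The sketch below is a minimal comparison that parses the two text outputs; the function names are hypothetical, and the sample strings are copied from the outputs above.

```python
import re


def toolkit_version(nvcc_output: str):
    """Extract (major, minor) from the 'release X.Y' part of `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def driver_cuda_version(smi_output: str):
    """Extract (major, minor) from the 'CUDA Version: X.Y' field of nvidia-smi."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


# Lines copied from the environment section of this issue.
nvcc_output = "Cuda compilation tools, release 12.1, V12.1.105"
smi_output = "| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |"

toolkit = toolkit_version(nvcc_output)
driver = driver_cuda_version(smi_output)

# Tuples compare element by element, so (12, 1) > (11, 0).
toolkit_newer_than_driver = toolkit > driver
if toolkit_newer_than_driver:
    print(f"Toolkit {toolkit[0]}.{toolkit[1]} is newer than the driver's CUDA {driver[0]}.{driver[1]}")
else:
    print("Toolkit version is within what the driver reports")
```

On a live machine, the same functions could be fed the captured output of `nvcc --version` and `nvidia-smi` instead of the pasted strings.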

WXFMAV, Apr 23 '23 10:04