
Compilation Failed

jchang98 opened this issue 2 years ago • 6 comments

  • python 3.7.12
  • pytorch 1.11.0+cu102
  • gcc 5.4

I have modified the cloneable.h file according to the FAQs section, but I still encounter the following error when running the program. How can I fix it?

 
```
Traceback (most recent call last):
  File "/home/env/nat/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1746, in _run_ninja_build
    env=env)
  File "/home/env/nat/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

RuntimeError: Error building extension 'dag_loss_fn': [1/2] /usr/local/cuda/bin/nvcc  -DTORCH_EXTENSION_NAME=dag_loss_fn -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/env/nat/lib/python3.7/site-packages/torch/include -isystem /home/env/nat/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/env/nat/lib/python3.7/site-packages/torch/include/TH -isystem /home/env/nat/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /home/env/nat/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_70,code=compute_70 -gencode=arch=compute_70,code=sm_70 --compiler-options '-fPIC' -DOF_SOFTMAX_USE_FAST_MATH -std=c++14 -c /home/DA-Transformer/fs_plugins/custom_ops/logsoftmax_gather.cu -o logsoftmax_gather.cuda.o
FAILED: logsoftmax_gather.cuda.o
/usr/local/cuda/bin/nvcc  -DTORCH_EXTENSION_NAME=dag_loss_fn -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/env/nat/lib/python3.7/site-packages/torch/include -isystem /home/env/nat/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/env/nat/lib/python3.7/site-packages/torch/include/TH -isystem /home/env/nat/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /home/env/nat/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_70,code=compute_70 -gencode=arch=compute_70,code=sm_70 --compiler-options '-fPIC' -DOF_SOFTMAX_USE_FAST_MATH -std=c++14 -c /home/DA-Transformer/fs_plugins/custom_ops/logsoftmax_gather.cu -o logsoftmax_gather.cuda.o
/home/DA-Transformer/fs_plugins/custom_ops/logsoftmax_gather.cu:31:23: fatal error: cub/cub.cuh: No such file or directory
compilation terminated.
ninja: build stopped: subcommand failed
```
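For reference, a quick way to see which include directories PyTorch hands to nvcc, and whether any of them actually provides cub/cub.cuh, is a short check like the following. This is a minimal diagnostic sketch; `include_paths(cuda=True)` matches the PyTorch 1.x signature.

```python
# Minimal diagnostic sketch: list the include directories PyTorch passes to
# nvcc and check whether any of them contains cub/cub.cuh.
# Note: include_paths(cuda=True) is the PyTorch 1.x signature; newer PyTorch
# versions take a device_type string instead.
import os
from torch.utils.cpp_extension import include_paths

for d in include_paths(cuda=True):
    header = os.path.join(d, "cub", "cub.cuh")
    status = "found  " if os.path.exists(header) else "missing"
    print(f"{status} {header}")
```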

jchang98 avatar Jul 21 '22 10:07 jchang98

It seems that PyTorch 1.11 removed cub from its default include directories. A direct workaround is to use PyTorch 1.10.

I am trying to include cub in PyTorch 1.11 and will update this issue if I find a solution.
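For reference, one way to make a CUB checkout visible to a JIT-built extension is the `extra_include_paths` argument of `torch.utils.cpp_extension.load`. This is a hedged sketch assuming a local clone of https://github.com/NVIDIA/cub; the source list and paths are illustrative, not this repo's actual build code.

```python
# Hedged sketch: point the JIT extension build at a local CUB checkout.
# The clone's root directory contains cub/cub.cuh, so it serves as the
# include path. Source list and paths are illustrative, not the repo's
# actual build code.
from torch.utils.cpp_extension import load

dag_loss_fn = load(
    name="dag_loss_fn",
    sources=["fs_plugins/custom_ops/logsoftmax_gather.cu"],
    extra_include_paths=["/path/to/cub"],  # clone of https://github.com/NVIDIA/cub
    verbose=True,
)
```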

hzhwcmhf avatar Jul 21 '22 10:07 hzhwcmhf

> It seems that PyTorch 1.11 removed cub from its default include directories. A direct workaround is to use PyTorch 1.10.
>
> I am trying to include cub in PyTorch 1.11 and will update this issue if I find a solution.

I tried reinstalling PyTorch 1.10.1, but it doesn't work (T⌓T)

jchang98 avatar Jul 21 '22 11:07 jchang98

I am trying to reproduce your environment... it may take some time before I can find a solution.

If possible, you can also try CUDA >= 11.0, which bundles cub with the toolkit. Or just skip the CUDA compilation by adding the following arguments:

```
--torch-dag-loss                  # Use the torch implementation of dag loss instead of the CUDA implementation. It may be slower and consume more memory.
--torch-dag-best-alignment        # Use the torch implementation of best-alignment instead of the CUDA implementation. It may be slower and consume more memory.
--torch-dag-logsoftmax-gather     # Use the torch implementation of logsoftmax-gather instead of the CUDA implementation. It may be slower and consume more memory.
```
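For context, the torch fallback behind the last flag boils down to a log-softmax followed by a gather. A rough sketch of the computation follows; the function name and tensor shapes are illustrative, not this repo's actual code.

```python
# Rough sketch of what a torch implementation of logsoftmax-gather computes.
# A fused CUDA kernel can avoid materializing the full [batch, graph_len,
# vocab] log-probability tensor that this version creates, which is why the
# torch fallback may be slower and use more memory.
import torch

def logsoftmax_gather(logits: torch.Tensor, index: torch.Tensor) -> torch.Tensor:
    # logits: [batch, graph_len, vocab]; index: [batch, graph_len, k]
    log_probs = torch.log_softmax(logits, dim=-1)  # normalize over the vocabulary
    return log_probs.gather(dim=-1, index=index)   # scores of the selected token ids
```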

hzhwcmhf avatar Jul 21 '22 11:07 hzhwcmhf

> I am trying to reproduce your environment... it may take some time before I can find a solution.
>
> If possible, you can also try CUDA >= 11.0, which bundles cub with the toolkit. Or just skip the CUDA compilation by adding the following arguments:
>
> --torch-dag-loss                  # Use the torch implementation of dag loss instead of the CUDA implementation. It may be slower and consume more memory.
> --torch-dag-best-alignment        # Use the torch implementation of best-alignment instead of the CUDA implementation. It may be slower and consume more memory.
> --torch-dag-logsoftmax-gather     # Use the torch implementation of logsoftmax-gather instead of the CUDA implementation. It may be slower and consume more memory.

Okay, thanks!

jchang98 avatar Jul 21 '22 12:07 jchang98

@jchang98 I have pushed an update that manually includes the cub library. Please re-clone this repo and try again.

hzhwcmhf avatar Jul 21 '22 13:07 hzhwcmhf

@sudanl Can you run the script with only one GPU (a single process)? The message only says that the CUDA program was not compiled correctly but does not show the real error.

hzhwcmhf avatar Oct 17 '22 12:10 hzhwcmhf