crfrnn_layer

Code not working properly in Pytorch 1.7.0, 1.7.1

Open heng-yuwen opened this issue 4 years ago • 3 comments

Hi, I tried to run your code with PyTorch 1.7.0 and 1.7.1, and the sample code you provide fails to build. This may be a PyTorch bug caused by the -ccbin flag being passed to nvcc more than once.

Error details:

running install
running bdist_egg
running egg_info
writing permutohedral.egg-info/PKG-INFO
writing dependency_links to permutohedral.egg-info/dependency_links.txt
writing top-level names to permutohedral.egg-info/top_level.txt
reading manifest file 'permutohedral.egg-info/SOURCES.txt'
writing manifest file 'permutohedral.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
running build_ext
building 'permutohedral_ext' extension
Emitting ninja build file /home/yh1n19/test/crfrnn_layer/build/temp.linux-x86_64-3.8/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/4] /usr/local/cuda-10.1/bin/nvcc -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include/TH -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda-10.1/include -I/home/yh1n19/miniconda3/envs/cv/include/python3.8 -c -c /home/yh1n19/test/crfrnn_layer/src/gfilt_wrapper.cu -o /home/yh1n19/test/crfrnn_layer/build/temp.linux-x86_64-3.8/src/gfilt_wrapper.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -ccbin=/home/yh1n19/miniconda3/envs/cv/bin -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=permutohedral_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=sm_75 -ccbin /home/yh1n19/miniconda3/envs/cv/bin/x86_64-conda_cos6-linux-gnu-cc -std=c++14
FAILED: /home/yh1n19/test/crfrnn_layer/build/temp.linux-x86_64-3.8/src/gfilt_wrapper.o
/usr/local/cuda-10.1/bin/nvcc -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include/TH -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda-10.1/include -I/home/yh1n19/miniconda3/envs/cv/include/python3.8 -c -c /home/yh1n19/test/crfrnn_layer/src/gfilt_wrapper.cu -o /home/yh1n19/test/crfrnn_layer/build/temp.linux-x86_64-3.8/src/gfilt_wrapper.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -ccbin=/home/yh1n19/miniconda3/envs/cv/bin -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=permutohedral_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=sm_75 -ccbin /home/yh1n19/miniconda3/envs/cv/bin/x86_64-conda_cos6-linux-gnu-cc -std=c++14
nvcc fatal   : redefinition of argument 'compiler-bindir'
[2/4] /usr/local/cuda-10.1/bin/nvcc -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include/TH -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda-10.1/include -I/home/yh1n19/miniconda3/envs/cv/include/python3.8 -c -c /home/yh1n19/test/crfrnn_layer/src/gfilt_cuda.cu -o /home/yh1n19/test/crfrnn_layer/build/temp.linux-x86_64-3.8/src/gfilt_cuda.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -ccbin=/home/yh1n19/miniconda3/envs/cv/bin -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=permutohedral_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=sm_75 -ccbin /home/yh1n19/miniconda3/envs/cv/bin/x86_64-conda_cos6-linux-gnu-cc -std=c++14
FAILED: /home/yh1n19/test/crfrnn_layer/build/temp.linux-x86_64-3.8/src/gfilt_cuda.o
/usr/local/cuda-10.1/bin/nvcc -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include/TH -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda-10.1/include -I/home/yh1n19/miniconda3/envs/cv/include/python3.8 -c -c /home/yh1n19/test/crfrnn_layer/src/gfilt_cuda.cu -o /home/yh1n19/test/crfrnn_layer/build/temp.linux-x86_64-3.8/src/gfilt_cuda.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -ccbin=/home/yh1n19/miniconda3/envs/cv/bin -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=permutohedral_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=sm_75 -ccbin /home/yh1n19/miniconda3/envs/cv/bin/x86_64-conda_cos6-linux-gnu-cc -std=c++14
nvcc fatal   : redefinition of argument 'compiler-bindir'
[3/4] /usr/local/cuda-10.1/bin/nvcc -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include/TH -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda-10.1/include -I/home/yh1n19/miniconda3/envs/cv/include/python3.8 -c -c /home/yh1n19/test/crfrnn_layer/src/build_hash.cu -o /home/yh1n19/test/crfrnn_layer/build/temp.linux-x86_64-3.8/src/build_hash.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -ccbin=/home/yh1n19/miniconda3/envs/cv/bin -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=permutohedral_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=sm_75 -ccbin /home/yh1n19/miniconda3/envs/cv/bin/x86_64-conda_cos6-linux-gnu-cc -std=c++14
FAILED: /home/yh1n19/test/crfrnn_layer/build/temp.linux-x86_64-3.8/src/build_hash.o
/usr/local/cuda-10.1/bin/nvcc -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include/TH -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda-10.1/include -I/home/yh1n19/miniconda3/envs/cv/include/python3.8 -c -c /home/yh1n19/test/crfrnn_layer/src/build_hash.cu -o /home/yh1n19/test/crfrnn_layer/build/temp.linux-x86_64-3.8/src/build_hash.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -ccbin=/home/yh1n19/miniconda3/envs/cv/bin -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=permutohedral_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=sm_75 -ccbin /home/yh1n19/miniconda3/envs/cv/bin/x86_64-conda_cos6-linux-gnu-cc -std=c++14
nvcc fatal   : redefinition of argument 'compiler-bindir'
[4/4] /usr/local/cuda-10.1/bin/nvcc -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include/TH -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda-10.1/include -I/home/yh1n19/miniconda3/envs/cv/include/python3.8 -c -c /home/yh1n19/test/crfrnn_layer/src/build_hash_wrapper.cu -o /home/yh1n19/test/crfrnn_layer/build/temp.linux-x86_64-3.8/src/build_hash_wrapper.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -ccbin=/home/yh1n19/miniconda3/envs/cv/bin -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=permutohedral_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=sm_75 -ccbin /home/yh1n19/miniconda3/envs/cv/bin/x86_64-conda_cos6-linux-gnu-cc -std=c++14
FAILED: /home/yh1n19/test/crfrnn_layer/build/temp.linux-x86_64-3.8/src/build_hash_wrapper.o
/usr/local/cuda-10.1/bin/nvcc -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include/TH -I/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda-10.1/include -I/home/yh1n19/miniconda3/envs/cv/include/python3.8 -c -c /home/yh1n19/test/crfrnn_layer/src/build_hash_wrapper.cu -o /home/yh1n19/test/crfrnn_layer/build/temp.linux-x86_64-3.8/src/build_hash_wrapper.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -ccbin=/home/yh1n19/miniconda3/envs/cv/bin -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=permutohedral_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=sm_75 -ccbin /home/yh1n19/miniconda3/envs/cv/bin/x86_64-conda_cos6-linux-gnu-cc -std=c++14
nvcc fatal   : redefinition of argument 'compiler-bindir'
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/home/yh1n19/miniconda3/envs/cv/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1516, in _run_ninja_build
    subprocess.run(
  File "/home/yh1n19/miniconda3/envs/cv/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

There are two -ccbin arguments in each nvcc command in the log, which may be what causes the error:

 -ccbin=/home/yh1n19/miniconda3/envs/cv/bin -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=permutohedral_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=sm_75 -ccbin /home/yh1n19/miniconda3/envs/cv/bin/x86_64-conda_cos6-linux-gnu-cc -std=c++14 
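A quick way to confirm this on a long command line is to extract every compiler-bindir value from it. The following is only a minimal sketch: the helper name `find_ccbin_values` and the shortened sample command (with made-up `/opt/env` paths) are illustrative assumptions, not part of this repo or of PyTorch. It relies only on the fact that nvcc accepts the option as `-ccbin` or `--compiler-bindir`, with the value either joined by `=` or given as the next argument.

```python
import shlex

# Both spellings refer to the same nvcc option.
CCBIN_NAMES = ("-ccbin", "--compiler-bindir")


def find_ccbin_values(command):
    """Return every compiler-bindir value found in an nvcc command line.

    Hypothetical helper for inspecting build logs; handles both the
    `-ccbin=VALUE` and the `-ccbin VALUE` forms.
    """
    tokens = shlex.split(command)
    values = []
    i = 0
    while i < len(tokens):
        name, sep, value = tokens[i].partition("=")
        if name in CCBIN_NAMES:
            if sep:
                # Form: -ccbin=VALUE
                values.append(value)
            elif i + 1 < len(tokens):
                # Form: -ccbin VALUE (value is the next token)
                values.append(tokens[i + 1])
                i += 1
        i += 1
    return values


# Shortened, made-up command with the same shape as the failing ones above.
sample = (
    "nvcc -O3 -ccbin=/opt/env/bin "
    "-gencode=arch=compute_75,code=sm_75 "
    "-ccbin /opt/env/bin/cc -std=c++14 -c example.cu"
)

found = find_ccbin_values(sample)
if len(found) > 1:
    print("compiler-bindir given more than once:", found)
```

If the helper reports more than one value, nvcc will reject the command with the "redefinition of argument 'compiler-bindir'" error seen in the log.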

After uninstalling ninja and updating to PyTorch 1.8.0, compilation succeeds. However, when I run your sample with the parameters required by the CRF initialiser added (n_ref=3, n_out=numClass, dev=dev), I still get the following error:

Unsupported ref_dim/val_dim combination (5, 16), generate a new dispatch table using 'make_gfilt_dispatch.py'

heng-yuwen, Mar 14 '21 17:03

Is this repo not fully updated for recent PyTorch versions?

heng-yuwen, Mar 14 '21 18:03

Yes, this repo isn't really actively maintained. In particular, I haven't tested the CRF part in a very long time. I think setup.py probably needs an update.

HapeMask, Mar 30 '21 20:03

It works fine with PyTorch 1.10.0+cu102.

Regarding the error

Unsupported ref_dim/val_dim combination (5, 16), generate a new dispatch table using 'make_gfilt_dispatch.py'

you have to do what it says: generate a new dispatch table with make_gfilt_dispatch.py, replace the file gfilt_dispatch_table.h with it, and then recompile and reinstall everything.

For example, Cityscapes has 19 classes, but the default table that is created when you run python setup.py install supports only up to 16 classes, so this error occurs and you have to extend the table manually with the make_gfilt_dispatch.py script.
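To illustrate why regenerating the table helps, here is a toy sketch of the general idea behind a dispatch table keyed on (ref_dim, val_dim). It is not the repo's actual code: the function names, the kernel strings, and the dimension ranges below are all made up for illustration, and the real table is a generated C++ header whose format is not shown in this thread.

```python
def make_dispatch_table(ref_dims, val_dims):
    """Build a toy table mapping (ref_dim, val_dim) to a kernel name.

    Hypothetical stand-in for a generated header: each entry represents
    one kernel that was compiled ahead of time for that combination.
    """
    return {(r, v): f"kernel<{r}, {v}>" for r in ref_dims for v in val_dims}


def lookup(table, ref_dim, val_dim):
    """Return the kernel for a combination, or fail like the error above."""
    try:
        return table[(ref_dim, val_dim)]
    except KeyError:
        raise ValueError(
            f"Unsupported ref_dim/val_dim combination ({ref_dim}, {val_dim})"
        ) from None


# Toy ranges only; they do not reflect the repo's real defaults.
small_table = make_dispatch_table(range(2, 5), range(1, 9))
larger_table = make_dispatch_table(range(2, 7), range(1, 33))

print(lookup(larger_table, 5, 16))  # found once the table covers this pair

try:
    lookup(small_table, 5, 16)
except ValueError as err:
    print(err)  # the smaller table has no entry for this pair
```

Only combinations that were generated and compiled in advance can be looked up at run time, which is why the table has to be regenerated and the extension rebuilt when the number of classes goes beyond what the existing table covers.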

lukaszbinden, Sep 01 '22 09:09