pointnet2

Compilation error in tf_sampling_compile.sh

Open nuosferatu opened this issue 5 years ago • 7 comments

When I run tf_sampling_compile.sh, I get a compilation error:

$ ./tf_sampling_compile.sh 
tf_sampling.cpp: In lambda function:
tf_sampling.cpp:20:40: warning: ignoring return value of 'tensorflow::Status tensorflow::shape_inference::InferenceContext::WithRank(tensorflow::shape_inference::ShapeHandle, tensorflow::int64, tensorflow::shape_inference::ShapeHandle*)', declared with attribute warn_unused_result [-Wunused-result]
     c->WithRank(c->input(0), 2, &dims1);
                                        ^
In file included from tf_sampling.cpp:8:0:
/home/coder/anaconda3/envs/wyh-spfn/lib/python3.6/site-packages/tensorflow/include/tensorflow/core/framework/shape_inference.h:394:10: note: declared here
   Status WithRank(ShapeHandle shape, int64 rank,
          ^~~~~~~~
tf_sampling.cpp:22:40: warning: ignoring return value of 'tensorflow::Status tensorflow::shape_inference::InferenceContext::WithRank(tensorflow::shape_inference::ShapeHandle, tensorflow::int64, tensorflow::shape_inference::ShapeHandle*)', declared with attribute warn_unused_result [-Wunused-result]
     c->WithRank(c->input(1), 2, &dims2);
                                        ^
In file included from tf_sampling.cpp:8:0:
/home/coder/anaconda3/envs/wyh-spfn/lib/python3.6/site-packages/tensorflow/include/tensorflow/core/framework/shape_inference.h:394:10: note: declared here
   Status WithRank(ShapeHandle shape, int64 rank,
          ^~~~~~~~
tf_sampling.cpp: In lambda function:
tf_sampling.cpp:34:40: warning: ignoring return value of 'tensorflow::Status tensorflow::shape_inference::InferenceContext::WithRank(tensorflow::shape_inference::ShapeHandle, tensorflow::int64, tensorflow::shape_inference::ShapeHandle*)', declared with attribute warn_unused_result [-Wunused-result]
     c->WithRank(c->input(0), 3, &dims1);
                                        ^
In file included from tf_sampling.cpp:8:0:
/home/coder/anaconda3/envs/wyh-spfn/lib/python3.6/site-packages/tensorflow/include/tensorflow/core/framework/shape_inference.h:394:10: note: declared here
   Status WithRank(ShapeHandle shape, int64 rank,
          ^~~~~~~~
tf_sampling.cpp: In lambda function:
tf_sampling.cpp:47:40: warning: ignoring return value of 'tensorflow::Status tensorflow::shape_inference::InferenceContext::WithRank(tensorflow::shape_inference::ShapeHandle, tensorflow::int64, tensorflow::shape_inference::ShapeHandle*)', declared with attribute warn_unused_result [-Wunused-result]
     c->WithRank(c->input(0), 3, &dims1);
                                        ^
In file included from tf_sampling.cpp:8:0:
/home/coder/anaconda3/envs/wyh-spfn/lib/python3.6/site-packages/tensorflow/include/tensorflow/core/framework/shape_inference.h:394:10: note: declared here
   Status WithRank(ShapeHandle shape, int64 rank,
          ^~~~~~~~
tf_sampling.cpp:49:40: warning: ignoring return value of 'tensorflow::Status tensorflow::shape_inference::InferenceContext::WithRank(tensorflow::shape_inference::ShapeHandle, tensorflow::int64, tensorflow::shape_inference::ShapeHandle*)', declared with attribute warn_unused_result [-Wunused-result]
     c->WithRank(c->input(1), 2, &dims2);
                                        ^
In file included from tf_sampling.cpp:8:0:
/home/coder/anaconda3/envs/wyh-spfn/lib/python3.6/site-packages/tensorflow/include/tensorflow/core/framework/shape_inference.h:394:10: note: declared here
   Status WithRank(ShapeHandle shape, int64 rank,
          ^~~~~~~~
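
Every diagnostic in the log above is a -Wunused-result warning or a note; no line in it is marked as an error. A minimal sketch for checking this on a saved log, assuming the script output was redirected to a file (the function name count_diagnostics and the file name build.log are hypothetical, not part of the project):

```shell
#!/usr/bin/env bash
# Hypothetical helper: count GCC-style "warning:" and "error:" lines in a saved build log.
count_diagnostics() {
  local log_file=$1
  local warnings errors
  # grep -c prints 0 (and exits non-zero) when nothing matches; only the count is used here.
  warnings=$(grep -c ': warning: ' "$log_file")
  # Matches compiler errors, "fatal error:" lines, and "collect2: error:" from the linker.
  errors=$(grep -c -E ': (fatal )?error: ' "$log_file")
  echo "warnings=$warnings errors=$errors"
}
```

Possible usage: `./tf_sampling_compile.sh > build.log 2>&1; count_diagnostics build.log`. If the error count is zero, it may be worth checking whether tf_sampling_so.so was actually produced before assuming the build failed.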

The details of the two bash files:

# tf_sampling_compile.sh
#!/usr/bin/env bash
source ../config.sh

$nvcc_bin tf_sampling_g.cu -o tf_sampling_g.cu.o -c -O2 -DGOOGLE_CUDA=1 -x cu -Xcompiler -fPIC

# TF1.4
g++ -std=c++11 tf_sampling.cpp tf_sampling_g.cu.o -o tf_sampling_so.so -shared -fPIC \
  -lcudart \
  -I $cuda_include_dir \
  -I $tensorflow_include_dir \
  -I $tensorflow_external_dir \
  -L $cuda_library_dir \
  -L $tensorflow_library_dir \
  -ltensorflow_framework -O2

# config.sh
#!/usr/bin/env bash

TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')
TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')

nvcc_bin=/usr/local/cuda-10.0/bin/nvcc

cuda_include_dir=/usr/local/cuda-10.0/include
tensorflow_include_dir=$TF_INC
tensorflow_external_dir=$TF_INC/external/nsync/public

cuda_library_dir=/usr/local/cuda-10.0/lib64/
tensorflow_library_dir=$TF_LIB

My environment is a Docker container. I installed TensorFlow 1.10.0, cudatoolkit 8.0, and cuDNN 7.1.3 with conda in a virtual environment, and I modified the TensorFlow and CUDA paths. However, the global CUDA driver is 10.0. Should I be concerned about the version difference between the CUDA driver and cudatoolkit?
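
Note that config.sh above points nvcc_bin at /usr/local/cuda-10.0, so the script compiles with that toolkit rather than the conda cudatoolkit 8.0. A rough sketch for printing the versions involved, assuming nvidia-smi and python are on PATH (the function name report_versions is hypothetical; each check is skipped if the tool is missing):

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the toolkit, driver, and TensorFlow versions side by side.
# Optional first argument: path to nvcc (defaults to whatever "nvcc" resolves to on PATH).
report_versions() {
  local nvcc_bin=${1:-nvcc}
  if command -v "$nvcc_bin" >/dev/null 2>&1; then
    # The line containing "release" carries the CUDA toolkit version.
    "$nvcc_bin" --version | grep -i 'release'
  else
    echo "nvcc: not found ($nvcc_bin)"
  fi
  if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=driver_version --format=csv,noheader
  else
    echo "nvidia-smi: not on PATH"
  fi
  if command -v python >/dev/null 2>&1; then
    python -c 'import tensorflow as tf; print("tensorflow", tf.__version__)' 2>/dev/null \
      || echo "tensorflow: not importable"
  else
    echo "python: not on PATH"
  fi
  return 0
}
```

Possible usage, with the path taken from config.sh: `report_versions /usr/local/cuda-10.0/bin/nvcc`.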

nuosferatu avatar Oct 09 '19 11:10 nuosferatu

Hi, my Ubuntu server is shared in the lab and only has CUDA 10.1. Can you give me a link or instructions on how to use CUDA 8 or 9 in a virtual environment on it? Since CUDA is hardware-related, I wonder how that can be done. Can you give me some details?

AmmonZ avatar Oct 11 '19 05:10 AmmonZ

I'm in a similar situation. CUDA 10.0 is in the (base) env of my Docker container, and I installed cudatoolkit 8.0 in the (my-virtual) env created by conda. I haven't found out whether the difference has any effect. Now I'm trying to compile it again directly on the host OS. Wish me luck.

nuosferatu avatar Oct 11 '19 06:10 nuosferatu

I want to share my situation here; I hope it helps.

On my Ubuntu workstation, I installed tensorflow-gpu in my conda env with: conda install -c anaconda tensorflow-gpu. The installed version of tensorflow-gpu is v1.14.0, which differs from the TF1.2 GPU version the author mentions in README.md.

I could not compile tf_ops correctly with the tf_xxx_compile.sh scripts provided in the project; error logs related to symbol resolution were always displayed. I think there may be some version compatibility problems in the compile scripts. So I followed the "Build the op library" guide on the TensorFlow website and compiled all three ops successfully. For example, my final compile script for the sampling op is:

# Ask the installed TensorFlow for its own compile and link flags.
TF_CFLAGS=( $(python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_compile_flags()))') )
TF_LFLAGS=( $(python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_link_flags()))') )

# Compile the CUDA kernel to an object file.
nvcc -std=c++11 -c -o tf_sampling_g.cu.o tf_sampling_g.cu \
  ${TF_CFLAGS[@]} -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC

# Compile the op and link it with the kernel object into a shared library.
g++ -std=c++11 -shared -o tf_sampling_so.so tf_sampling.cpp \
  tf_sampling_g.cu.o ${TF_CFLAGS[@]} -fPIC -lcudart ${TF_LFLAGS[@]}
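
Symbol problems may only surface when the library is loaded, so it can help to load the result once after building. A minimal sketch, assuming python resolves to the same environment used for the build (the function name check_op_library is hypothetical; tf.load_op_library is the TensorFlow call for loading a custom op library):

```shell
#!/usr/bin/env bash
# Hypothetical helper: try to load a compiled op library with the current TensorFlow.
check_op_library() {
  local so_path=$1
  if [ ! -f "$so_path" ]; then
    echo "missing: $so_path" >&2
    return 1
  fi
  # An undefined-symbol problem would show up here as a Python exception and a non-zero exit.
  python -c 'import sys, tensorflow as tf; tf.load_op_library(sys.argv[1]); print("loaded", sys.argv[1])' "$so_path"
}
```

Possible usage: `check_op_library ./tf_sampling_so.so` (a path containing a slash, so the file in the current directory is the one that gets loaded).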

hi-zhengcheng avatar Oct 21 '19 20:10 hi-zhengcheng

I solved this problem by changing versions.

The version correspondence is very important. I installed tensorflow-gpu 1.12 (<1.13) with conda, so the corresponding CUDA version has to be 9.0. If your tensorflow-gpu is 1.14, you need to upgrade CUDA to 10.0.

Version                Python version  Compiler   Build tools   cuDNN  CUDA
tensorflow-2.0.0       2.7, 3.3-3.7    GCC 7.3.1  Bazel 0.26.1  7.4    10.0
tensorflow_gpu-1.14.0  2.7, 3.3-3.7    GCC 4.8    Bazel 0.24.1  7.4    10.0
tensorflow_gpu-1.13.1  2.7, 3.3-3.7    GCC 4.8    Bazel 0.19.2  7.4    10.0
tensorflow_gpu-1.12.0  2.7, 3.3-3.6    GCC 4.8    Bazel 0.15.0  7      9
tensorflow_gpu-1.11.0  2.7, 3.3-3.6    GCC 4.8    Bazel 0.15.0  7      9
tensorflow_gpu-1.10.0  2.7, 3.3-3.6    GCC 4.8    Bazel 0.15.0  7      9
tensorflow_gpu-1.9.0   2.7, 3.3-3.6    GCC 4.8    Bazel 0.11.0  7      9
tensorflow_gpu-1.8.0   2.7, 3.3-3.6    GCC 4.8    Bazel 0.10.0  7      9
tensorflow_gpu-1.7.0   2.7, 3.3-3.6    GCC 4.8    Bazel 0.9.0   7      9
tensorflow_gpu-1.6.0   2.7, 3.3-3.6    GCC 4.8    Bazel 0.9.0   7      9
tensorflow_gpu-1.5.0   2.7, 3.3-3.6    GCC 4.8    Bazel 0.8.0   7      9
tensorflow_gpu-1.4.0   2.7, 3.3-3.6    GCC 4.8    Bazel 0.5.4   6      8
tensorflow_gpu-1.3.0   2.7, 3.3-3.6    GCC 4.8    Bazel 0.4.5   6      8
tensorflow_gpu-1.2.0   2.7, 3.3-3.6    GCC 4.8    Bazel 0.4.5   5.1    8
tensorflow_gpu-1.1.0   2.7, 3.3-3.6    GCC 4.8    Bazel 0.4.2   5.1    8
tensorflow_gpu-1.0.0   2.7, 3.3-3.6    GCC 4.8    Bazel 0.4.2   5.1    8

PS: The version of the NVIDIA driver is also relevant to this problem.
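
The CUDA column of the table above can be turned into a small lookup. This is only a sketch of the table rows as listed here; the function name cuda_for_tf is hypothetical, and versions not in the table are reported as unknown rather than guessed:

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a TensorFlow version to the CUDA version listed in the table.
cuda_for_tf() {
  case "$1" in
    2.0.*|1.13.*|1.14.*)  echo "10.0" ;;              # tensorflow-2.0.0, 1.13.1, 1.14.0
    1.[5-9].*|1.1[0-2].*) echo "9" ;;                 # 1.5.0 through 1.12.0
    1.[0-4].*)            echo "8" ;;                 # 1.0.0 through 1.4.0
    *)                    echo "unknown"; return 1 ;; # not covered by the table
  esac
}
```

Possible usage: `cuda_for_tf "$(python -c 'import tensorflow as tf; print(tf.__version__)')"`, then compare the result with the toolkit that nvcc reports.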

nuosferatu avatar Oct 22 '19 02:10 nuosferatu

I have the same problem. Have you solved it?

wanyiming2017 avatar Nov 02 '19 15:11 wanyiming2017

Yes, I solved this problem. The key is the versions of both the GPU driver and the libraries. You can follow the table in my reply above and select matching versions, from the GPU driver to CUDA and cuDNN, as well as the other libraries installed by conda. By the way, you also need to be careful about the Python version (3.6 or 3.7).

nuosferatu avatar Nov 02 '19 15:11 nuosferatu

I am using Windows, Python 3.9, TensorFlow 2.4, CUDA 10.1, and GCC 9.3. I am also unable to compile the tf ops. Could someone please help?

pratibhashinde avatar Feb 10 '22 11:02 pratibhashinde