
Failed to build caffe2 on Centos

Open Luo-Liang opened this issue 7 years ago • 15 comments

Hi, I am building caffe2 from source, following the instructions at https://caffe2.ai/docs/getting-started.html?platform=centos&configuration=cloud, and the build fails while compiling the gloo-related targets. Here is the output:

liangluo@n37:~/caffe2/build$ make
[ 14%] Built target nccl_external
[ 14%] Built target pthreadpool
[ 14%] Built target nnpack_reference_layers
[ 14%] Built target nnpack
[ 14%] Built target gmock_main
[ 14%] Built target gmock
[ 14%] Built target gtest
[ 14%] Built target gtest_main
[ 14%] Built target benchmark
[ 14%] Built target gloo
[ 14%] Building NVCC (Device) object third_party/gloo/gloo/CMakeFiles/gloo_cuda.dir/nccl/gloo_cuda_generated_nccl.cu.o
nvcc fatal : A single input file is required for a non-link phase when an outputfile is specified
CMake Error at gloo_cuda_generated_nccl.cu.o.Release.cmake:203 (message):
  Error generating /xxxxx/home/liangluo/caffe2/build/third_party/gloo/gloo/CMakeFiles/gloo_cuda.dir/nccl/./gloo_cuda_generated_nccl.cu.o

make[2]: *** [third_party/gloo/gloo/CMakeFiles/gloo_cuda.dir/nccl/gloo_cuda_generated_nccl.cu.o] Error 1
make[1]: *** [third_party/gloo/gloo/CMakeFiles/gloo_cuda.dir/all] Error 2
make: *** [all] Error 2

I'm using CUDA 8.

Any suggestions? Is any more information required?

Thanks!

Luo-Liang avatar Oct 28 '17 20:10 Luo-Liang

I have the same problem on CentOS. It looks like something was broken recently.

SlinkoIgor avatar Oct 31 '17 15:10 SlinkoIgor

After pulling the latest code, I now get the following error when running cmake3 .. in the build folder: CMake Error: INSTALL(EXPORT) given unknown export "GlooTargets"

Luo-Liang avatar Nov 03 '17 06:11 Luo-Liang

I added set(CMAKE_VERBOSE_MAKEFILE on) to CMakeLists.txt for more debug info:

/usr/local/cuda/bin/nvcc -M -D__CUDACC__ /opt/caffe2/third_party/gloo/gloo/nccl/nccl.cu -o /opt/caffe2/build/third_party/gloo/gloo/CMakeFiles/gloo_cuda.dir/nccl/gloo_cuda_generated_nccl.cu.o.NVCC-depend -ccbin /usr/bin/cc -m64 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -Xcudafe --diag_suppress=cc_clobber_ignored -Xcudafe --diag_suppress=integer_sign_change -Xcudafe --diag_suppress=useless_using_declaration -Xcudafe --diag_suppress=set_but_not_used -std=c++11 -Xcompiler -fPIC --expt-relaxed-constexpr -Wno-deprecated-gpu-targets -DNVCC -I/usr/local/cuda/include -I/opt/caffe2/build/third_party/gloo -I/opt/caffe2/third_party/cub -I/opt/caffe2/third_party/nccl/build/include -I/usr/local/cuda/include -I/opt/caffe2/third_party/pybind11/include -I/opt/caffe2/third_party/eigen -I/usr/include -I/opt/caffe2/third_party/benchmark/include -I/opt/caffe2/third_party/googletest/googletest/include -I/opt/caffe2/third_party/protobuf/src /opt/caffe2/build/confu-srcs/pthreadpool/include /opt/caffe2/build/confu-srcs/fxdiv/include -I/opt/caffe2/third_party/protobuf/src /opt/caffe2/third_party/NNPACK/include /opt/caffe2/third_party/NNPACK/src /opt/caffe2/build/confu-srcs/pthreadpool/include /opt/caffe2/build/confu-srcs/fxdiv/include /opt/caffe2/build/confu-srcs/psimd/include /opt/caffe2/build/confu-srcs/fp16/include -I/opt/caffe2/third_party/protobuf/src -I/opt/anaconda3/include/python3.6m -I/opt/anaconda3/lib/python3.6/site-packages/numpy/core/include -I/opt/caffe2/third_party/gloo

In this nvcc command, many include dirs don't have the -I option prefix. The command works when I add -I before those include dirs, so it seems to be a CMake bug.

yytdfc avatar Nov 29 '17 09:11 yytdfc
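
The check described in the comment above can be sketched in Python. This walks an nvcc-style argument list and reports every bare argument that is neither an option, an option's value, nor the first input file. The function name `find_bare_args`, the `VALUE_OPTIONS` set, and the shortened command (including the shortened `-o` path) are illustrative assumptions; this is not part of nvcc or CMake.

```python
import shlex

# Options in the command above that consume the following argument as their value.
# This set is an assumption covering only that command, not a full nvcc option table.
VALUE_OPTIONS = {"-o", "-ccbin", "-gencode", "-Xcudafe", "-Xcompiler"}


def find_bare_args(command):
    """Return positional arguments other than the first one (the source file)."""
    args = shlex.split(command)[1:]  # drop the nvcc executable itself
    positional = []
    skip_next = False
    for arg in args:
        if skip_next:
            # This argument is the value of the previous option.
            skip_next = False
            continue
        if arg in VALUE_OPTIONS:
            skip_next = True
            continue
        if arg.startswith("-"):
            continue
        positional.append(arg)
    # The first positional argument is the .cu input; the rest are extra bare
    # arguments, consistent with the "single input file" error.
    return positional[1:]


# Shortened version of the failing command from the verbose build output.
command = (
    "/usr/local/cuda/bin/nvcc -M -D__CUDACC__ "
    "/opt/caffe2/third_party/gloo/gloo/nccl/nccl.cu "
    "-o nccl.cu.o.NVCC-depend -ccbin /usr/bin/cc -m64 "
    "-gencode arch=compute_30,code=sm_30 "
    "-I/opt/caffe2/third_party/protobuf/src "
    "/opt/caffe2/build/confu-srcs/pthreadpool/include "
    "/opt/caffe2/build/confu-srcs/fxdiv/include "
    "-I/opt/caffe2/third_party/gloo"
)

for path in find_bare_args(command):
    print("missing -I:", path)
```

On the shortened command, the two confu-srcs directories are the ones reported, which matches the directories that appear without -I in the full command.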

Updating CMake to the latest version (3.10) fixes it.

yytdfc avatar Nov 30 '17 02:11 yytdfc

I'm using cmake3, version 3.6.1, and the same error occurred.

huangynn avatar Dec 05 '17 13:12 huangynn

@huangynn, please follow @yytdfc's suggestion and switch to CMake 3.10. Versions of CMake older than 3.10 always hit this error.

gjpicker avatar Dec 05 '17 14:12 gjpicker

@gjpicker You got it right: everything runs smoothly when cmake 3.1 is used (OSX). On my machine, cmake3 is cmake 3.6, while cmake, installed with brew, is cmake 3.1.

huangynn avatar Dec 06 '17 10:12 huangynn

I hit the same issue. How do I use CMake 3.10 on CentOS to build Caffe2? I have already updated to 3.10, but I don't know how to use it in place of the old version.

beanliao avatar Dec 08 '17 08:12 beanliao

Run cmake --version; if the version is 3.1, just follow the install steps in the tutorial.

huangynn avatar Dec 09 '17 03:12 huangynn

Thank you, @huangynn! I tried "cd build && cmake .." and then ran "make install", but there is another error during the build. Could you please advise? The error seems related to nccl, and I don't know how to fix it. I tried to install nccl from "/root/caffe2/third_party/nccl", but it didn't work.

[ 90%] Linking CXX executable ../bin/init_test
[ 91%] Linking CXX executable ../bin/logging_test
[ 91%] Linking CXX executable ../bin/common_test
../lib/libcaffe2_gpu.so: undefined reference to `ncclGroupStart'
../lib/libcaffe2_gpu.so: undefined reference to `ncclGroupEnd'
collect2: error: ld returned 1 exit status
make[2]: *** [bin/cpuid_test] Error 1
make[1]: *** [caffe2/CMakeFiles/cpuid_test.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
../lib/libcaffe2_gpu.so: undefined reference to `ncclGroupStart'
../lib/libcaffe2_gpu.so: undefined reference to `ncclGroupEnd'
collect2: error: ld returned 1 exit status
make[2]: *** [bin/timer_test] Error 1
make[1]: *** [caffe2/CMakeFiles/timer_test.dir/all] Error 2
[... the same two undefined references to `ncclGroupStart' and `ncclGroupEnd' in libcaffe2_gpu.so, each followed by "collect2: error: ld returned 1 exit status", repeat for every remaining test and binary target ...]
[100%] Linking CXX executable ../bin/blob_test
CMakeFiles/blob_test.dir/core/blob_test.cc.o: In function `caffe2::(anonymous namespace)::TypedTensorTest_BigTensorSerialization_Test<float>::TestBody()':
blob_test.cc:(.text+0x441b1): warning: the use of `tmpnam' is dangerous, better use `mkstemp'
../lib/libcaffe2_gpu.so: undefined reference to `ncclGroupStart'
../lib/libcaffe2_gpu.so: undefined reference to `ncclGroupEnd'
collect2: error: ld returned 1 exit status
make[2]: *** [bin/blob_test] Error 1
make[1]: *** [caffe2/CMakeFiles/blob_test.dir/all] Error 2
[100%] Linking CXX shared module python/caffe2_pybind11_state_gpu.so
[100%] Built target caffe2_pybind11_state_gpu
[100%] Linking CXX shared module python/caffe2_pybind11_state.so
[100%] Built target caffe2_pybind11_state
make: *** [all] Error 2

beanliao avatar Dec 10 '17 08:12 beanliao

@beanliao Your issue is unrelated to this one (as discussed in #1601)

pietern avatar Dec 11 '17 21:12 pietern

@yytdfc Thanks for your investigation. I can't repro the issue and we're trying to figure it out in facebookincubator/gloo#100. I'll take a look at the CMake diffs between 3.6 and 3.10 for the CUDA support.

pietern avatar Dec 12 '17 04:12 pietern

I found a repro. I needed to explicitly enable NNPACK support. I found this by looking more closely at the nvcc command when the error happens, which indicates something is going on with the NNPACK includes that is incompatible with nvcc.

Now looking for a fix (other than upgrading CMake).

pietern avatar Dec 12 '17 19:12 pietern

NNPACK uses a CMake generator expression for its include dirs ($<TARGET_PROPERTY:nnpack,INCLUDE_DIRECTORIES>), and this apparently isn't expanded into the real list (which contains more than one directory) at the point where the -I prefixes are added. It is only expanded after the argument list has been built, so we end up with paths without -I in the argument list passed to nvcc.

pietern avatar Dec 12 '17 21:12 pietern
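
The ordering problem described in the comment above can be illustrated with a toy model in Python. This is not CMake's actual implementation: the `expand` helper, the `PROPERTIES` table, and the paths are made up for illustration. It only shows why adding the -I prefix before expanding a generator expression that yields several directories leaves all but the first directory without a prefix.

```python
# Toy stand-in for a target property holding more than one include directory.
# The paths are hypothetical.
PROPERTIES = {
    "$<TARGET_PROPERTY:nnpack,INCLUDE_DIRECTORIES>": [
        "/src/NNPACK/include",
        "/src/NNPACK/src",
        "/build/pthreadpool/include",
    ],
}


def expand(item):
    """Replace a known generator expression inside item with a ';'-joined
    list, then split on ';' the way a CMake list is split."""
    for expr, dirs in PROPERTIES.items():
        item = item.replace(expr, ";".join(dirs))
    return item.split(";")


def prefix_then_expand(include_dirs):
    """Problematic order: add -I first, expand generator expressions afterwards."""
    flags = []
    for item in include_dirs:
        flags.extend(expand("-I" + item))
    return flags


def expand_then_prefix(include_dirs):
    """Intended order: expand first, then add -I to every resulting directory."""
    flags = []
    for item in include_dirs:
        flags.extend("-I" + d for d in expand(item))
    return flags


include_dirs = [
    "/usr/local/cuda/include",
    "$<TARGET_PROPERTY:nnpack,INCLUDE_DIRECTORIES>",
]

print(prefix_then_expand(include_dirs))
print(expand_then_prefix(include_dirs))
```

In the first result only the first NNPACK directory carries -I, which is the same shape as the failing nvcc command earlier in the thread.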

This was addressed in Kitware/CMake@7ded655f7ba82ea72a82d0555449f2df5ef38594, which was first included in CMake 3.7.

pietern avatar Dec 12 '17 21:12 pietern
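
For anyone checking whether an installed CMake is new enough, here is a minimal Python sketch that parses the text printed by `cmake --version` (or `cmake3 --version`) and compares it against 3.7, the release named in the comment above. The helper names `parse_cmake_version` and `has_fix` are hypothetical, and the sample strings are typed in by hand rather than captured from a real run.

```python
import re

# First CMake release that contains the fix, per the comment above.
FIRST_FIXED = (3, 7)


def parse_cmake_version(version_output):
    """Extract (major, minor, patch) from the text printed by `cmake --version`."""
    match = re.search(r"version (\d+)\.(\d+)(?:\.(\d+))?", version_output)
    if match is None:
        raise ValueError("no CMake version found in: %r" % version_output)
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def has_fix(version_output):
    """True if the reported major.minor version is at least FIRST_FIXED."""
    return parse_cmake_version(version_output)[:2] >= FIRST_FIXED


for sample in ("cmake3 version 3.6.1", "cmake version 3.10.0"):
    print(sample, "->", "ok" if has_fix(sample) else "too old")
```

Comparing the version as a tuple of integers matters here: as plain strings or decimals, 3.10 would sort before 3.7, and 3.1 and 3.10 are easy to confuse.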