
Dependency declaration issue in mpi.so build

Open: aburden5 opened this issue 8 years ago • 1 comment

Hi, I am new to Bazel and am attempting to build TensorFlow-AllReduce from source without CUDA.

(git rev-parse HEAD: 44302f0961e76002989523c2f424a011e54fd806)

I am using Bazel 0.4.5, but I get the same error with 0.4.3. I am on RHEL 4.8.5-4.

When I run "bazel build -c opt :mpi.so", I get the following error:

ERROR: /home_nfs/aburdenx/apps/tensorflow-allreduce/tensorflow/contrib/mpi/BUILD:13:1: undeclared inclusion(s) in rule '//tensorflow/contrib/mpi:mpi.so': this rule is missing dependency declarations for the following files included by 'tensorflow/contrib/mpi/mpi_ops.cc': '/home_nfs/aburdenx/apps/tensorflow-allreduce/tensorflow/stream_executor/lib/statusor.h' '/home_nfs/aburdenx/apps/tensorflow-allreduce/tensorflow/stream_executor/platform/port.h' '/home_nfs/aburdenx/apps/tensorflow-allreduce/tensorflow/stream_executor/lib/error.h' '/home_nfs/aburdenx/apps/tensorflow-allreduce/tensorflow/stream_executor/lib/status.h' '/home_nfs/aburdenx/apps/tensorflow-allreduce/tensorflow/stream_executor/platform/logging.h'.
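To see at a glance which directories the undeclared headers come from, the error text can be grouped by path. The following is only a minimal sketch, assuming the message format shown above; the function name group_undeclared_headers and its workspace_root argument are hypothetical helpers for illustration, not part of Bazel.

```python
import posixpath
import re
from collections import defaultdict


def group_undeclared_headers(error_text, workspace_root):
    """Group the quoted absolute .h paths in a Bazel error by directory.

    The function name and the workspace_root argument are illustrative
    assumptions; neither is part of Bazel.
    """
    groups = defaultdict(list)
    # In the message above, header paths are single-quoted and absolute.
    for path in re.findall(r"'(/[^']+\.h)'", error_text):
        rel = posixpath.relpath(path, workspace_root)
        groups[posixpath.dirname(rel)].append(posixpath.basename(rel))
    return dict(groups)


# Abbreviated sample built from two of the paths in the error above.
ROOT = "/home_nfs/aburdenx/apps/tensorflow-allreduce"
SAMPLE = (
    "undeclared inclusion(s) in rule '//tensorflow/contrib/mpi:mpi.so': "
    "this rule is missing dependency declarations for the following files "
    "included by 'tensorflow/contrib/mpi/mpi_ops.cc': "
    f"'{ROOT}/tensorflow/stream_executor/lib/statusor.h' "
    f"'{ROOT}/tensorflow/stream_executor/platform/port.h'."
)

for directory, headers in sorted(group_undeclared_headers(SAMPLE, ROOT).items()):
    print(directory, headers)
```

Applied to the full message, all five headers fall under tensorflow/stream_executor/lib and tensorflow/stream_executor/platform, so the missing declaration concerns stream_executor headers only.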

I have tried several fixes without success, including:

-> I tried adding a dependency declaration in tensorflow-allreduce/tensorflow/contrib/mpi/BUILD, as follows:

tf_custom_op_library(
    name = "mpi.so",
    srcs = ["mpi_ops.cc", "ring.cc", "ring.h"],
    gpu_srcs = ["ring.cu.cc", "ring.h"],
    deps = [
        "//third_party/mpi:mpi",
        ":mpi_message_proto_cc",
        "//tensorflow/stream_executor:stream_executor",
    ],
)

but I get this error:

tensorflow/stream_executor:stream_executor cannot depend on tensorflow/core:lib.

-> I tried adding --copt="-Itensorflow/stream_executor/lib" to the bazel build command, with no luck.

The "undeclared inclusion(s)" error seems to crop up mostly in CUDA builds, where it is caused by a missing path to the CUDA include directory, but this is clearly a different issue, and I haven't seen any solutions for it.

I will greatly appreciate any help moving forward.

Thanks!

aburden5, May 05 '17 01:05

Maybe you should remove the "//tensorflow/stream_executor:stream_executor" dependency and add "--config=cuda" to your bazel build options: bazel build -c opt --config=cuda tensorflow/tools/pip_package:build_pip_package

rhdong, Aug 12 '17 11:08