Compiling options for tensorflow

Open chenhuan0 opened this issue 7 years ago • 1 comments

Hi, I followed this article and reproduced the throughput it reports. However, when I compile TensorFlow myself, I cannot reach the same throughput.

Which compiler version and options were used to build the prebuilt package?

I tried passing "-march=native -O3" to bazel. Throughput improved, but it is still lower than with your package.

I look forward to your reply.

Thanks

chenhuan0, Nov 28 '18 09:11

Here is what I used:

Compile: I run the configure step manually, so I just answer its questions. I keep all defaults except for the following:

  • CUDA 10 and cuDNN 7.3.1 (I have seen some regressions with cuDNN 7.4 that are fixed at head; I am testing that today, and it might improve performance by another 10%)
  • XLA yes (default in TF 1.12)
  • NCCL 2.3.5
  • you can include TensorRT but it doesn't matter for the ResNet test
  • compute capability 7.0 (or whatever you need/want)
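
For a scripted build, the answers listed above could be supplied as environment variables before running the configure script. This is only a sketch: the variable names are the ones the TensorFlow 1.x configure script is understood to read, and the exact version strings (for example `10.0` for "CUDA 10") are assumptions that depend on how the libraries are installed on the build machine.

```shell
#!/bin/sh
# Sketch: pre-answer the configure prompts through environment variables.
# All values mirror the list above; the version string formats are assumptions.
export TF_NEED_CUDA=1                     # build with CUDA support
export TF_CUDA_VERSION=10.0               # "CUDA 10" in the list above
export TF_CUDNN_VERSION=7.3.1             # cuDNN 7.3.1
export TF_NCCL_VERSION=2.3.5              # NCCL 2.3.5
export TF_ENABLE_XLA=1                    # XLA yes (default in TF 1.12)
export TF_NEED_TENSORRT=0                 # optional; not needed for the ResNet test
export TF_CUDA_COMPUTE_CAPABILITIES=7.0   # or whatever you need/want

# Run from the TensorFlow source root once the values match your system:
# ./configure
echo "configure answers: CUDA $TF_CUDA_VERSION, cuDNN $TF_CUDNN_VERSION, NCCL $TF_NCCL_VERSION"
```

Prompts that are not covered by a variable are still asked interactively, so the remaining defaults can be accepted by hand.
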
# I build with haswell, which gives AVX2 support, because I am
# too lazy to type out all of the individual flags I want.
# I think ivybridge is the one to use if you only want AVX. If your GCC
# is older, it may not support the haswell alias.
bazel build -c opt --copt=-march="broadwell" //tensorflow/tools/pip_package:build_pip_package
# Make the .whl
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

tfboyd, Nov 28 '18 18:11
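
The reply above picks a `-march` alias by hand. As a rough sketch of the same decision, the script below reads the CPU feature flags on a Linux x86 machine and prints a matching build command without running it. The helper name `pick_march` is hypothetical, the avx2-to-haswell and avx-to-ivybridge mapping follows the comments in the reply, and the `x86-64` fallback is an assumption.

```shell
#!/bin/sh
# pick_march (hypothetical helper): map a space-separated CPU feature list
# to a GCC -march alias, following the haswell/ivybridge advice above.
pick_march() {
    case " $1 " in
        *" avx2 "*) echo "haswell" ;;    # AVX2 available
        *" avx "*)  echo "ivybridge" ;;  # AVX only
        *)          echo "x86-64" ;;     # assumed generic fallback
    esac
}

# Read the feature flags of the first CPU entry.
# Linux x86 only; the list stays empty on other systems.
cpu_flags=""
if [ -r /proc/cpuinfo ]; then
    cpu_flags=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)
fi

march=$(pick_march "$cpu_flags")

# Print the build command instead of running it, so it can be reviewed first.
echo "bazel build -c opt --copt=-march=$march //tensorflow/tools/pip_package:build_pip_package"
```

Whether the chosen alias is accepted still depends on the installed GCC version, as noted in the reply.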