issues with building tfx r2.8 from source with mkl support
Describe the problem the feature is intended to solve
When building tfx r2.8-rc0 with mkl support, I see the following issue:
ERROR: /root/.cache/bazel/_bazel_root/c206fe4b7a49887ed31d86472abc6776/external/org_tensorflow/tensorflow/core/common_runtime/BUILD:1739:11: Couldn't build file external/org_tensorflow/tensorflow/core/common_runtime/_objs/threadpool_device/threadpool_device.o: C++ compilation of rule '@org_tensorflow//tensorflow/core/common_runtime:threadpool_device' failed (Exit 1): gcc failed: error executing command
(cd /root/.cache/bazel/_bazel_root/c206fe4b7a49887ed31d86472abc6776/execroot/tf_serving && \
exec env - \
LD_LIBRARY_PATH='/usr/local/lib:$LD_LIBRARY_PATH' \
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
PWD=/proc/self/cwd \
/usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 '-D_FORTIFY_SOURCE=1' -DNDEBUG -ffunction-sections -fdata-sections '-std=c++11' -MD -MF bazel-out/k8-opt/bin/external/org_tensorflow/tensorflow/core/common_runtime/_objs/threadpool_device/threadpool_device.d '-frandom-seed=bazel-out/k8-opt/bin/external/org_tensorflow/tensorflow/core/common_runtime/_objs/threadpool_device/threadpool_device.o' -DTF_USE_SNAPPY -DEIGEN_MPL2_ONLY '-DEIGEN_MAX_ALIGN_BYTES=64' -DHAVE_SYS_UIO_H -iquoteexternal/org_tensorflow -iquotebazel-out/k8-opt/bin/external/org_tensorflow -iquoteexternal/com_google_absl -iquotebazel-out/k8-opt/bin/external/com_google_absl -iquoteexternal/nsync -iquotebazel-out/k8-opt/bin/external/nsync -iquoteexternal/eigen_archive -iquotebazel-out/k8-opt/bin/external/eigen_archive -iquoteexternal/gif -iquotebazel-out/k8-opt/bin/external/gif -iquoteexternal/libjpeg_turbo -iquotebazel-out/k8-opt/bin/external/libjpeg_turbo -iquoteexternal/com_google_protobuf -iquotebazel-out/k8-opt/bin/external/com_google_protobuf -iquoteexternal/zlib -iquotebazel-out/k8-opt/bin/external/zlib -iquoteexternal/com_googlesource_code_re2 -iquotebazel-out/k8-opt/bin/external/com_googlesource_code_re2 -iquoteexternal/farmhash_archive -iquotebazel-out/k8-opt/bin/external/farmhash_archive -iquoteexternal/fft2d -iquotebazel-out/k8-opt/bin/external/fft2d -iquoteexternal/highwayhash -iquotebazel-out/k8-opt/bin/external/highwayhash -iquoteexternal/double_conversion -iquotebazel-out/k8-opt/bin/external/double_conversion -iquoteexternal/snappy -iquotebazel-out/k8-opt/bin/external/snappy -isystem external/nsync/public -isystem bazel-out/k8-opt/bin/external/nsync/public -isystem external/org_tensorflow/third_party/eigen3/mkl_include -isystem bazel-out/k8-opt/bin/external/org_tensorflow/third_party/eigen3/mkl_include -isystem external/eigen_archive -isystem bazel-out/k8-opt/bin/external/eigen_archive -isystem external/gif -isystem bazel-out/k8-opt/bin/external/gif -isystem external/com_google_protobuf/src -isystem bazel-out/k8-opt/bin/external/com_google_protobuf/src -isystem external/zlib -isystem bazel-out/k8-opt/bin/external/zlib -isystem external/farmhash_archive/src -isystem bazel-out/k8-opt/bin/external/farmhash_archive/src -isystem external/double_conversion -isystem bazel-out/k8-opt/bin/external/double_conversion -mavx -msse4.2 '-std=c++14' '-D_GLIBCXX_USE_CXX11_ABI=0' -DEIGEN_AVOID_STL_ARRAY -Iexternal/gemmlowp -Wno-sign-compare '-ftemplate-depth=900' -fno-exceptions -DINTEL_MKL -DENABLE_MKL -DENABLE_ONEDNN_OPENMP -msse3 -DTENSORFLOW_MONOLITHIC_BUILD -pthread -fopenmp -fno-canonical-system-headers -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -c external/org_tensorflow/tensorflow/core/common_runtime/threadpool_device.cc -o bazel-out/k8-opt/bin/external/org_tensorflow/tensorflow/core/common_runtime/_objs/threadpool_device/threadpool_device.o)
Execution platform: @local_execution_config_platform//:platform
external/org_tensorflow/tensorflow/core/common_runtime/threadpool_device.cc:19:10: fatal error: external/llvm_openmp/include/omp.h: No such file or directory
19 | #include "external/llvm_openmp/include/omp.h"
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
Target //tensorflow_serving/model_servers:tensorflow_model_server failed to build
INFO: Elapsed time: 0.945s, Critical Path: 0.02s
INFO: 3 processes: 3 internal.
FAILED: Build did NOT complete successfully
cp: cannot stat 'bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server': No such file or directory
Describe alternatives you've considered
If I remove the mkl build flag (--config=mkl), the build succeeds.
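For reference, the non-MKL invocation that works for me looks roughly like this (a sketch; it assumes the same model server target and keeps only the release config from the Dockerfile below):

bazel build --color=yes --curses=yes \
    --config=release \
    --verbose_failures \
    tensorflow_serving/model_servers:tensorflow_model_server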
Additional context
Add any other context or screenshots about the feature request here.
Bug Report
If this is a bug report, please fill out the following form in full:
System information
Please just build the following Dockerfile. It is adapted from the CPU devel image; in particular, see the TF_SERVING_BUILD_OPTIONS argument and the bazel build commands below for the build options.
# Copyright 2018 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
ARG BASE_IMAGE=ubuntu:20.04
FROM $BASE_IMAGE as base_build
ENV DEBIAN_FRONTEND=noninteractive
ENV TZ=America/Los_Angeles
ARG TF_SERVING_VERSION_GIT_BRANCH=master
ARG TF_SERVING_VERSION_GIT_COMMIT=HEAD
LABEL maintainer="Abolfazl Shahbazi <[email protected]>"
LABEL tensorflow_serving_github_branchtag=${TF_SERVING_VERSION_GIT_BRANCH}
LABEL tensorflow_serving_github_commit=${TF_SERVING_VERSION_GIT_COMMIT}
RUN apt-get update && apt-get install -y --no-install-recommends \
automake \
build-essential \
ca-certificates \
curl \
git \
libcurl3-dev \
libfreetype6-dev \
libpng-dev \
libtool \
libzmq3-dev \
mlocate \
openjdk-8-jdk\
openjdk-8-jre-headless \
pkg-config \
python-dev \
software-properties-common \
swig \
unzip \
wget \
zip \
zlib1g-dev \
python3-distutils \
&& \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
RUN curl -fSsL -O https://bootstrap.pypa.io/get-pip.py && \
python3 get-pip.py && \
rm get-pip.py
# Install python
ARG PYTHON=python3.8
ENV PYTHON=$PYTHON
RUN add-apt-repository ppa:deadsnakes/ppa && \
apt-get update && apt-get install -y \
${PYTHON} ${PYTHON}-dev python3-pip ${PYTHON}-venv && \
rm -rf /var/lib/apt/lists/* && \
${PYTHON} -m pip install pip --upgrade && \
update-alternatives --install /usr/bin/python3 python3 /usr/bin/${PYTHON} 0
# Make ${PYTHON} the default python version
RUN update-alternatives --install /usr/bin/python python /usr/bin/${PYTHON} 0
RUN $PYTHON -m pip --no-cache-dir install \
future>=0.17.1 \
grpcio \
h5py \
keras_applications>=1.0.8 \
keras_preprocessing>=1.1.0 \
mock \
numpy \
portpicker \
requests \
--ignore-installed six>=1.12.0
# Set up Bazel
ARG BAZEL_VERSION=4.2.1
ENV BAZEL_VERSION=${BAZEL_VERSION}
WORKDIR /
RUN mkdir /bazel && \
cd /bazel && \
curl -H "User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36" -fSsL -O https://github.com/bazelbuild/bazel/releases/download/$BAZEL_VERSION/bazel-$BAZEL_VERSION-installer-linux-x86_64.sh && \
curl -H "User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36" -fSsL -o /bazel/LICENSE.txt https://raw.githubusercontent.com/bazelbuild/bazel/master/LICENSE && \
chmod +x bazel-*.sh && \
./bazel-$BAZEL_VERSION-installer-linux-x86_64.sh && \
cd / && \
rm -f /bazel/bazel-$BAZEL_VERSION-installer-linux-x86_64.sh
# Download TF Serving sources (optionally at specific commit).
# WORKDIR /tensorflow-serving
# RUN curl -sSL --retry 5 https://github.com/tensorflow/serving/tarball/${TF_SERVING_VERSION_GIT_COMMIT} | tar --strip-components=1 -xzf -
RUN git clone -b r2.8 https://github.com/tensorflow/serving.git /tensorflow_serving
WORKDIR /tensorflow_serving
# FROM base_build as binary_build
# Build, and install TensorFlow Serving
ARG TF_SERVING_BUILD_OPTIONS="--config=mkl --config=release"
RUN echo "Building with build options: ${TF_SERVING_BUILD_OPTIONS}"
ARG TF_SERVING_BAZEL_OPTIONS=""
RUN echo "Building with Bazel options: ${TF_SERVING_BAZEL_OPTIONS}"
RUN bazel build --color=yes --curses=yes \
${TF_SERVING_BAZEL_OPTIONS} \
--verbose_failures \
--output_filter=DONT_MATCH_ANYTHING \
${TF_SERVING_BUILD_OPTIONS} \
tensorflow_serving/model_servers:tensorflow_model_server && \
cp bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server \
/usr/local/bin/
# Build and install TensorFlow Serving API
RUN bazel build --color=yes --curses=yes \
${TF_SERVING_BAZEL_OPTIONS} \
--verbose_failures \
--output_filter=DONT_MATCH_ANYTHING \
${TF_SERVING_BUILD_OPTIONS} \
tensorflow_serving/tools/pip_package:build_pip_package && \
bazel-bin/tensorflow_serving/tools/pip_package/build_pip_package \
/tmp/pip && \
pip --no-cache-dir install --upgrade \
/tmp/pip/tensorflow_serving_api-*.whl && \
rm -rf /tmp/pip
# Copy openmp libraries
RUN cp /root/.cache/bazel/_bazel_root/*/execroot/tf_serving/bazel-out/k8-opt/bin/external/llvm_openmp/libiomp5.so /usr/local/lib/
ENV LIBRARY_PATH '/usr/local/lib:$LIBRARY_PATH'
ENV LD_LIBRARY_PATH '/usr/local/lib:$LD_LIBRARY_PATH'
# FROM binary_build as clean_build
# # Clean up Bazel cache when done.
RUN bazel clean --expunge --color=yes && \
rm -rf /root/.cache
CMD ["/bin/bash"]
@hsl89,
Can you clarify whether you mean you're trying to install TF 2.8.0-rc0 and not TFX? The latest stable release of TFX is 1.5.0, not 2.8.0. If you're looking to build TensorFlow with MKL rather than TFX, you can also use this guide as a reference. Thanks!
@hsl89,
Closing this issue due to lack of recent activity. Please feel free to reopen the issue with more details if you still have questions. Thanks!
@sanatmpa1
Sorry for the late reply. No, I was trying to build the tensorflow/serving (this repo) r2.8 branch from source, and I encountered the error posted in the description when building with the MKL flag.
This is happening for me too. Can we re-open the issue? @sanatmpa1
@pindinagesh any updates?
cc: @TensorFlow-MKL @agramesh1
@hsl89 FYI, we have incorporated oneDNN (MKL) support into the official TensorFlow x86 builds since TF 2.5. You can build TF with the normal config (without --config=mkl) and turn on oneDNN optimizations by setting the environment variable TF_ENABLE_ONEDNN_OPTS=1 (and disable them by setting it to 0). More info about the TF_ENABLE_ONEDNN_OPTS flag here.
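For example, a minimal sketch of toggling this at serving time (the model name and path are placeholders):

export TF_ENABLE_ONEDNN_OPTS=1   # turn oneDNN optimizations on (set to 0 to turn them off)
tensorflow_model_server --rest_api_port=8501 \
    --model_name=my_model --model_base_path=/models/my_model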
cc: @TensorFlow-MKL @agramesh1
CCing @ashahba
@penpornk Thanks for the pointers! FYI, I tried removing --config=mkl and setting TF_ENABLE_ONEDNN_OPTS=1, and now I can build TF Serving with TensorFlow.
The MKL build still fails (not related to your change) because of this step (see https://github.com/tensorflow/serving/blob/c0998e13451b9b83c9bdf157dd3648b2272dac59/tensorflow_serving/tools/docker/Dockerfile.devel-mkl#L123-L124):
# Copy openmp libraries
RUN cp /root/.cache/bazel/_bazel_root/*/execroot/tf_serving/bazel-out/k8-opt/bin/external/llvm_openmp/libiomp5.so /usr/local/lib/
There is no such file at /root/.cache/bazel/_bazel_root/*/execroot/tf_serving/bazel-out/k8-opt/bin/external/llvm_openmp/libiomp5.so.
I think the TF Serving team should look into where this .so file ends up after your oneDNN change.
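As a workaround sketch (not an official fix), the copy step could search the bazel output tree for the library instead of hard-coding the k8-opt path, assuming the .so is still produced somewhere under it:

# Hypothetical replacement for the copy step above: locate libiomp5.so and fail loudly if it is missing
RUN LIBIOMP="$(find /root/.cache/bazel -name libiomp5.so -print -quit)" && \
    if [ -n "$LIBIOMP" ]; then cp "$LIBIOMP" /usr/local/lib/; \
    else echo "libiomp5.so not found in bazel output" >&2; exit 1; fi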
Hi @penpornk, thanks for the update. Sorry I was not able to follow up on this thread in a more timely manner. I wonder what the difference is between --config=mkl and --config=mkl_open_source_only. In the .bazelrc we have
build:mkl --define=build_with_mkl=true --define=enable_mkl=true --define=build_with_openmp=true
build:mkl --define=tensorflow_mkldnn_contraction_kernel=0
# This config option is used to enable MKL-DNN open source library only,
# without depending on MKL binary version.
build:mkl_open_source_only --define=build_with_mkl_dnn_only=true
build:mkl_open_source_only --define=build_with_mkl=true --define=enable_mkl=true
build:mkl_open_source_only --define=tensorflow_mkldnn_contraction_kernel=0
Is it true that if we set --config=mkl, then TF will try to build against the closed-source version of MKL?
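For context, the open-source-only build I have in mind would be invoked roughly like this (a sketch based only on the .bazelrc entries above, combined with the same release config as before):

bazel build --color=yes --curses=yes \
    --config=mkl_open_source_only --config=release \
    --verbose_failures \
    tensorflow_serving/model_servers:tensorflow_model_server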
Hi @hsl89, please take a look at that PR; it should fix your issue.
Closing this due to inactivity. Please take a look at the answers provided above, and feel free to reopen and post your comments if you still have questions. Thank you!
@penpornk @agramesh1 I am also experiencing the same issue while trying to remove all MKL-ML related configuration. The included file "omp.h" does not exist in the expected llvm_openmp include folder.
I will keep you updated if there is any progress.