vHive

CNN serving image build is broken

Open ustiugov opened this issue 4 years ago • 5 comments

Describe the bug: CNN serving image is broken due to an update in a dependency.

To Reproduce: The error appears in the nightly testing.

Expected behavior: Successful build.

Logs: https://github.com/ease-lab/vhive/runs/2436625774?check_suite_focus=true#step:3:133

Workaround: Use the pre-built image vhiveease/cnn_serving on DockerHub.

ustiugov, Apr 30 '21 12:04

I tried rebuilding this recently, since I changed my copy of server.py to fit my use case. The Docker build reports untrusted-signature errors at these lines:

    echo "http://dl-cdn.alpinelinux.org/alpine/edge/community" >> /etc/apk/repositories && \
    echo "http://dl-cdn.alpinelinux.org/alpine/edge/testing" >> /etc/apk/repositories && \

Eventually this line fails too:

    apk add --allow-untrusted --repository http://dl-3.alpinelinux.org/alpine/edge/testing hdf5 hdf5-dev && \

I tried rebuilding the base image (tatsushid/alpine-py3-tensorflow-jupyter), since one of the search results suggested it. That fails with this message:

    ERROR: /tmp/bazel-0.7.0/src/java_tools/junitrunner/java/com/google/testing/coverage/BUILD:29:1: Building src/java_tools/junitrunner/java/com/google/testing/coverage/JacocoCoverage.jar (9 source files) failed (Exit 1): java failed: error executing command
      (cd /tmp/bazel_XXMiCOln/out/execroot/io_bazel && \
      exec env - \
        LC_CTYPE=en_US.UTF-8 \
      /usr/lib/jvm/java-1.8-openjdk/bin/java -XX:+TieredCompilation '-XX:TieredStopAtLevel=1' -Xbootclasspath/p:third_party/java/jdk/langtools/javac-9-dev-r4023-3.jar -jar bazel-out/host/bin/src/java_tools/buildjar/java/com/google/devtools/build/buildjar/bootstrap_deploy.jar @bazel-out/local-opt/bin/src/java_tools/junitrunner/java/com/google/testing/coverage/JacocoCoverage.jar-2.params).
    java.lang.InternalError: Cannot find requested resource bundle for locale en_US

There is a warning message that could be related:

    WARNING: /tmp/bazel_XXMiCOln/out/external/bazel_tools/WORKSPACE:1: Workspace name in /tmp/bazel_XXMiCOln/out/external/bazel_tools/WORKSPACE (@io_bazel) does not match the name given in the repository's definition (@bazel_tools); this will cause a build error in future versions.

adayaru, Aug 22 '22 06:08

@adayaru I suggest redoing this image on top of python-slim instead of Alpine, as it is much easier to get right and to maintain in the long term. We don't plan to do this work for now, so your contribution is more than welcome.

ustiugov, Aug 29 '22 11:08

Hi @ustiugov, @adayaru,

I tried to hack my way to a working build. The issue is caused by the rotation of the keys since Alpine 3.15.

I manually downloaded the required packages and installed them in the Docker image. A working copy is in commit 006bdda92b9eafa5b77cb97d3897062c26ae4806.

However, the same problem is present in the rnn_serving image: it also fails to build for the same reason, and I could not get that one to work.

niravnshah, Feb 21 '23 19:02
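
If the key rotation is indeed the cause, one thing that might be worth trying is pointing apk at the release branch that matches the base image instead of at edge. This is an untested assumption, not a confirmed fix. Below is a minimal sketch with two hypothetical helpers: `release_branch` turns the contents of /etc/alpine-release into a branch name, and `repo_lines` prints the main and community repository URLs for that branch. Release branches have no testing repository, so packages that were pulled from edge/testing (hdf5 in this thread) may simply be missing there.

```shell
#!/bin/sh
# Hypothetical helper: turn a full Alpine version such as "3.15.0"
# into its release branch name, "v3.15".
release_branch() {
    version="$1"
    major="${version%%.*}"   # text before the first dot
    rest="${version#*.}"     # text after the first dot
    minor="${rest%%.*}"      # text before the next dot, if any
    printf 'v%s.%s\n' "$major" "$minor"
}

# Hypothetical helper: print the main and community repository URLs
# for the given branch, one per line.
repo_lines() {
    branch="$1"
    for repo in main community; do
        printf 'http://dl-cdn.alpinelinux.org/alpine/%s/%s\n' "$branch" "$repo"
    done
}

# Possible use inside the image (assumes /etc/alpine-release exists there):
#   repo_lines "$(release_branch "$(cat /etc/alpine-release)")" > /etc/apk/repositories
```

The helpers only build strings, so they can be checked outside a container before being wired into a Dockerfile RUN step.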

@niravnshah could you please open a PR with the fix?

ustiugov, Feb 22 '23 11:02

@ustiugov, I am not sure we can call this a fix :) I explicitly downloaded x86_64 packages, so it will not work on aarch64. Also, I am not sure that pinning fixed versions of the apk packages is a good idea (they may become stale, which is why we have package managers). I guess this needs to be looked at from the base image perspective, including whether the base image should be changed.

niravnshah, Feb 22 '23 11:02
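
To illustrate the architecture concern: if packages are fetched by hand, the architecture could be selected at build time rather than hard-coded. A minimal sketch follows, assuming the edge/testing repository used earlier in this thread. The helper names `alpine_arch` and `testing_repo_url` are hypothetical, and only the two architectures mentioned here are handled. It does not address the stale-version concern.

```shell
#!/bin/sh
# Hypothetical helper: map a machine name (as printed by `uname -m`)
# to the architecture directory name used in Alpine repositories.
alpine_arch() {
    case "$1" in
        x86_64|amd64)  echo "x86_64" ;;
        aarch64|arm64) echo "aarch64" ;;
        *) echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: print the edge/testing repository URL
# for the given machine name, or fail if it is not handled.
testing_repo_url() {
    arch="$(alpine_arch "$1")" || return 1
    printf 'http://dl-cdn.alpinelinux.org/alpine/edge/testing/%s\n' "$arch"
}

# Possible use in a build script:
#   testing_repo_url "$(uname -m)"
```

Failing loudly on an unknown machine name is a deliberate choice here, so that a build on an unexpected platform stops instead of silently downloading x86_64 packages.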