dbx_build_tools
Hermeticity bug: Package setup can find host libraries (leading to import failures)
When trying to install numpy, I found that it failed to import because it depended on libraries that it could not access (CBLAS and LAPACK), resulting in undefined symbol errors (sometimes cblas_dot, sometimes saxpy_, sometimes _gfortran_concat_string, depending on what you have installed and what you have tried to provide via Bazel dependencies). The numpy wheel comes bundled with the necessary libraries but bypasses their installation if it finds implementations on the host system, allowing users to utilise alternative implementations suited to their needs.
A minimal reproduction in a docker image showed no problem: numpy worked fine. So the same code worked or borked depending on the host, which is indicative of a hermeticity issue. To verify, I installed openblas in the docker image before the numpy package is installed, and this produces undefined symbol errors when trying to import numpy.
I've attached an MWE. Inside is a small project that imports numpy, prints hello world, and outputs a mean to demonstrate numpy doing something. This is done within a docker environment to isolate the build from the host machine.
There are two dockerfiles: working.Dockerfile and broken.Dockerfile. The only difference between the two is that broken.Dockerfile installs libopenblas-dev before running bazel build.
To reproduce, download dbx_build_tools_bug.tar.gz and:
tar -xzvf dbx_build_tools_bug.tar.gz && cd dbx_build_tools_bug
docker build --network=host -t dbx_docker_broken -f broken.Dockerfile .
docker build --network=host -t dbx_docker_working -f working.Dockerfile .
docker run -it dbx_docker_working
docker run -it dbx_docker_broken
Running the working docker image outputs:
Starting local Bazel server and connecting to it...
INFO: Analyzed target //usage:example (39 packages loaded, 3812 targets configured).
INFO: Found 1 target...
Target //usage:example up-to-date:
bazel-bin/usage/example
INFO: Elapsed time: 4.154s, Critical Path: 0.16s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
INFO: Build completed successfully, 1 total action
Hello, world!
3.0
And running the 'broken' docker image outputs:
Starting local Bazel server and connecting to it...
INFO: Analyzed target //usage:example (39 packages loaded, 3812 targets configured).
INFO: Found 1 target...
Target //usage:example up-to-date:
bazel-bin/usage/example
INFO: Elapsed time: 4.288s, Critical Path: 0.28s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
INFO: Build completed successfully, 1 total action
Traceback (most recent call last):
File "/root/.cache/bazel/_bazel_root/b570b5ccd0454dc9af9f65ab1833764d/execroot/__main__/bazel-out/k8-fastbuild/bin/usage/example.runfiles/__main__/usage/example-wrapper.py", line 42, in <module>
exec(code, module.__dict__)
File "/root/.cache/bazel/_bazel_root/b570b5ccd0454dc9af9f65ab1833764d/execroot/__main__/bazel-out/k8-fastbuild/bin/usage/example.runfiles/__main__/usage/example.py", line 1, in <module>
import numpy
File "/root/.cache/bazel/_bazel_root/b570b5ccd0454dc9af9f65ab1833764d/execroot/__main__/bazel-out/k8-fastbuild/bin/usage/example.runfiles/__main__/pip/numpy/numpy-cpython-38/lib/numpy/__init__.py", line 148, in <module>
from . import lib
File "/root/.cache/bazel/_bazel_root/b570b5ccd0454dc9af9f65ab1833764d/execroot/__main__/bazel-out/k8-fastbuild/bin/usage/example.runfiles/__main__/pip/numpy/numpy-cpython-38/lib/numpy/lib/__init__.py", line 25, in <module>
from .index_tricks import *
File "/root/.cache/bazel/_bazel_root/b570b5ccd0454dc9af9f65ab1833764d/execroot/__main__/bazel-out/k8-fastbuild/bin/usage/example.runfiles/__main__/pip/numpy/numpy-cpython-38/lib/numpy/lib/index_tricks.py", line 12, in <module>
import numpy.matrixlib as matrixlib
File "/root/.cache/bazel/_bazel_root/b570b5ccd0454dc9af9f65ab1833764d/execroot/__main__/bazel-out/k8-fastbuild/bin/usage/example.runfiles/__main__/pip/numpy/numpy-cpython-38/lib/numpy/matrixlib/__init__.py", line 4, in <module>
from .defmatrix import *
File "/root/.cache/bazel/_bazel_root/b570b5ccd0454dc9af9f65ab1833764d/execroot/__main__/bazel-out/k8-fastbuild/bin/usage/example.runfiles/__main__/pip/numpy/numpy-cpython-38/lib/numpy/matrixlib/defmatrix.py", line 11, in <module>
from numpy.linalg import matrix_power
File "/root/.cache/bazel/_bazel_root/b570b5ccd0454dc9af9f65ab1833764d/execroot/__main__/bazel-out/k8-fastbuild/bin/usage/example.runfiles/__main__/pip/numpy/numpy-cpython-38/lib/numpy/linalg/__init__.py", line 73, in <module>
from .linalg import *
File "/root/.cache/bazel/_bazel_root/b570b5ccd0454dc9af9f65ab1833764d/execroot/__main__/bazel-out/k8-fastbuild/bin/usage/example.runfiles/__main__/pip/numpy/numpy-cpython-38/lib/numpy/linalg/linalg.py", line 33, in <module>
from numpy.linalg import lapack_lite, _umath_linalg
ImportError: /root/.cache/bazel/_bazel_root/b570b5ccd0454dc9af9f65ab1833764d/execroot/__main__/bazel-out/k8-fastbuild/bin/usage/example.runfiles/__main__/pip/numpy/numpy-cpython-38/lib/numpy/linalg/lapack_lite.cpython-38-x86_64-linux-gnu.so: undefined symbol: _gfortran_concat_string
It is assumed that the system has libgfortran available.
That isn't the case. You can try that out on the docker image by running apt install gfortran and/or apt install libgfortran5 before bazel build. It still can't link with libgfortran because gfortran isn't on the path provided to the runtime. I've also tried providing it with libgfortran within bazel, but to no avail. Not that this would solve the issue, because the developer would need to start adding dependencies that depend on whether X, Y or Z is available on the target system.
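One way to see which symbols an extension module expects something else to provide is to list its undefined dynamic symbols with nm from binutils. The following is a minimal sketch, not part of dbx_build_tools: the helper names and the default prefixes are assumptions chosen for illustration, and the parsing assumes the usual "U name" line layout printed by nm -D --undefined-only.

```python
import subprocess


def read_undefined(path):
    """Run nm on a shared object and return its raw output (requires binutils)."""
    result = subprocess.run(
        ["nm", "-D", "--undefined-only", path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def undefined_symbols(nm_output):
    """Extract strong undefined symbol names, dropping any @VERSION suffix."""
    symbols = []
    for line in nm_output.splitlines():
        parts = line.split()
        # Lines look like "U cblas_sdot"; weak references ("w") are skipped.
        if len(parts) >= 2 and parts[-2] == "U":
            symbols.append(parts[-1].split("@")[0])
    return symbols


def blas_like(symbols, prefixes=("cblas_", "_gfortran_")):
    """Keep symbols that look like they come from a BLAS or Fortran runtime.

    The prefixes are illustrative; Fortran BLAS names such as saxpy_ have no
    common prefix and would need to be listed explicitly.
    """
    return [s for s in symbols if s.startswith(tuple(prefixes))]
```

Running blas_like(undefined_symbols(read_undefined(path))) on lapack_lite.cpython-38-x86_64-linux-gnu.so from both images should show whether the broken build references symbols that the working one does not.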
The underlying issue is that the runtime environment is hermetic but the setup environment is leaky. As far as I can tell, vpip is given access to system libraries, so any setup process that searches for available resources in the paths it has access to is prone to creating dependencies that aren't available at runtime.
Yes, that's a well-known Bazel sandboxing hole: https://github.com/bazelbuild/bazel/issues/7313
This being the case, I'm wondering two things:
- In numpy, at least, a configuration can be provided using a site.cfg file. Is there a chance of an intermediate step between download and installation that allows configuration files to be swapped out before the installation?
- If setup can't be hermetic, then it's likely that other packages are also impacted, and their respective solutions may be very different. How about a wiki page where users can post approaches that they've found for setting up packages successfully?
We use a site.cfg to point scipy to our copy of OpenBLAS.
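For readers unfamiliar with the file, here is a rough sketch of what generating such a site.cfg could look like. The [openblas] section and its keys follow the site.cfg.example shipped with numpy.distutils-based releases; the function name and the /opt/openblas prefix are hypothetical, and this is not the configuration actually used in the setup described above.

```python
import configparser
import io


def render_site_cfg(openblas_prefix):
    """Return site.cfg text pointing a numpy.distutils build at one OpenBLAS copy."""
    cfg = configparser.ConfigParser()
    cfg["openblas"] = {
        "libraries": "openblas",
        "library_dirs": f"{openblas_prefix}/lib",
        "include_dirs": f"{openblas_prefix}/include",
        # Embeds an rpath so the same copy is found again at import time.
        "runtime_library_dirs": f"{openblas_prefix}/lib",
    }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical install prefix, for illustration only.
    print(render_site_cfg("/opt/openblas"))
```

The file only takes effect when the package is built from source, so it would have to be placed next to setup.py before the build step runs.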
We avoid the hermeticity problem by building everything with a minimal sandbox root fs. https://github.com/bazelbuild/bazel/issues/6994#issuecomment-457438551
The upstream issue has been resolved. Can this be closed?