How to create deterministic layers?
🐞 bug report
Affected Rule
The issue is caused by the rule: container_run_and_commit_layer, container_image (maybe)
Is this a regression?
Unsure
Description
When building a container_run_and_commit_layer target multiple times, the hash of the output differs from build to build.
However, container-diff shows no differences at the file level.
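When a file-level comparison reports nothing, the difference may sit in per-entry metadata (modification times, ownership, entry order) rather than in file contents. The sketch below is one way to check that for any two tar archives. It assumes GNU tar, and the function names tar_metadata and same_tar_metadata are made up for this illustration; they are not part of rules_docker or container-diff.

```sh
# Print one line per archive entry: mode, numeric uid/gid, size,
# full-resolution timestamp and path, in archive order.
# Assumes GNU tar (--full-time is a GNU option).
tar_metadata() {
  tar --list --verbose --full-time --numeric-owner --file "$1"
}

# Exit status 0 if both archives list identical entry metadata,
# 1 if the listings differ, 2 if either archive cannot be read.
same_tar_metadata() {
  left=$(tar_metadata "$1") || return 2
  right=$(tar_metadata "$2") || return 2
  [ "$left" = "$right" ]
}
```

If same_tar_metadata fails for two layer tarballs whose contents match, saving the output of tar_metadata for each and diffing the two listings should show which entries changed.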
🔬 Minimal Reproduction
https://github.com/njlr/bazel-run-commit
WORKSPACE
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
http_archive(
name = "bazel_skylib",
urls = [
"https://mirror.bazel.build/github.com/bazelbuild/bazel-skylib/releases/download/1.2.1/bazel-skylib-1.2.1.tar.gz",
"https://github.com/bazelbuild/bazel-skylib/releases/download/1.2.1/bazel-skylib-1.2.1.tar.gz",
],
sha256 = "f7be3474d42aae265405a592bb7da8e171919d74c16f082a5457840f06054728",
)
load("@bazel_skylib//:workspace.bzl", "bazel_skylib_workspace")
bazel_skylib_workspace()
http_archive(
name = "io_bazel_rules_docker",
sha256 = "b1e80761a8a8243d03ebca8845e9cc1ba6c82ce7c5179ce2b295cd36f7e394bf",
urls = ["https://github.com/bazelbuild/rules_docker/releases/download/v0.25.0/rules_docker-v0.25.0.tar.gz"],
)
load(
"@io_bazel_rules_docker//repositories:repositories.bzl",
container_repositories = "repositories",
)
container_repositories()
load("@io_bazel_rules_docker//repositories:deps.bzl", container_deps = "deps")
container_deps()
load(
"@io_bazel_rules_docker//container:container.bzl",
"container_pull",
)
container_pull(
name = "dotnet_runtime_deps_6_0_10",
registry = "mcr.microsoft.com",
repository = "dotnet/runtime-deps",
tag = "6.0.10-bullseye-slim-amd64",
digest = "sha256:24554fadd483d8305974ded44bb1dbe4916e2f02500b9e2d78e7beb557cfebd0"
)
BUILD.bazel
load("@io_bazel_rules_docker//container:container.bzl", "container_image")
load("@io_bazel_rules_docker//docker/util:run.bzl", "container_run_and_commit_layer")
load("@bazel_skylib//rules:copy_file.bzl", "copy_file")
container_run_and_commit_layer(
name = "install_git",
image = "@dotnet_runtime_deps_6_0_10//image",
commands = [
" && ".join([
"apt-get update -y",
"apt-get install -y git=1:2.30.2-1",
"apt-get clean",
"rm -rf /var/lib/apt/lists/*",
"rm -rf /var/cache/ldconfig/aux-cache",
"rm -rf /var/log/alternatives.log",
"rm -rf /var/log/apt/term.log",
"rm -rf /var/log/apt/history.log",
"rm -rf /var/log/dpkg.log",
"rm -rf /var/log/*",
"rm -rf /var/cache/debconf/templates.dat",
"rm -rf /var/lib/dpkg/status-old",
"rm -rf /var/lib/dpkg/status",
"rm -rf /var/cache/debconf/config.dat",
"rm -rf /etc/ld.so.cache",
"rm -rf /var/lib/apt/extended_states",
"rm -rf /var/log/apt/eipp.log.xz",
"git --version",
]),
],
)
container_image(
name = "image",
base = "@dotnet_runtime_deps_6_0_10//image",
layers = [
":install_git",
],
)
copy_file(
name = "image_archive",
src = ":image.tar",
out = "image_archive.tar",
is_executable = False,
allow_symlink = False,
)
test.sh
#!/bin/bash
set -e
set -o pipefail
rm -rf ./bazel-*
bazel clean
bazel build //:image_archive
sha256sum bazel-bin/image_archive.tar
rm -rf ./bazel-*
bazel clean
bazel build //:image_archive
sha256sum bazel-bin/image_archive.tar
INFO: Starting clean (this may take a while). Consider using --async if the clean takes more than several minutes.
INFO: Analyzed target //:image_archive (111 packages loaded, 7332 targets configured).
INFO: Found 1 target...
Target //:image_archive up-to-date:
bazel-bin/image_archive.tar
INFO: Elapsed time: 42.970s, Critical Path: 42.10s
INFO: 73 processes: 24 internal, 49 linux-sandbox.
INFO: Build completed successfully, 73 total actions
d51cbfa26560fe671e13655b0baa94a3d8426b4cc3a8726c2e4a2e05585ebc6b bazel-bin/image_archive.tar
INFO: Starting clean (this may take a while). Consider using --async if the clean takes more than several minutes.
INFO: Analyzed target //:image_archive (111 packages loaded, 7332 targets configured).
INFO: Found 1 target...
Target //:image_archive up-to-date:
bazel-bin/image_archive.tar
INFO: Elapsed time: 60.050s, Critical Path: 59.20s
INFO: 73 processes: 24 internal, 49 linux-sandbox.
INFO: Build completed successfully, 73 total actions
3b80585ed7dcf7f27590e48bb48b89d59ce6a1660f6ced7f081711c5e64fd064 bazel-bin/image_archive.tar
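The two hashes above only say that the archives differ somewhere. To narrow that down, one option is to hash every member of each archive and diff the results. This is a minimal sketch, assuming sha256sum and a tar that supports -C; hash_members is a hypothetical helper name, and first.tar and second.tar in the comment stand for copies of bazel-bin/image_archive.tar saved after each build.

```sh
# Print "<sha256>  <path>" for every regular file inside a tar archive,
# sorted by path, so that two archives can be compared member by member.
hash_members() {
  archive=$1
  workdir=$(mktemp -d) || return 1
  if ! tar -xf "$archive" -C "$workdir"; then
    rm -rf "$workdir"
    return 1
  fi
  # Hash in a subshell so the caller's working directory is untouched.
  (cd "$workdir" && find . -type f -exec sha256sum {} + | LC_ALL=C sort -k 2)
  rm -rf "$workdir"
}

# Possible usage (file names are placeholders):
#   hash_members first.tar  > first.txt
#   hash_members second.tar > second.txt
#   diff first.txt second.txt
```

Members that show up in the diff (for example a layer tarball or the image config) are the ones worth inspecting further.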
🔥 Exception or Error
N/A
🌍 Your Environment
Operating System:
lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04 LTS
Release: 22.04
Codename: jammy
Output of bazel version:
bazel --version
bazel 5.3.1
Rules_docker version:
http_archive(
name = "io_bazel_rules_docker",
sha256 = "b1e80761a8a8243d03ebca8845e9cc1ba6c82ce7c5179ce2b295cd36f7e394bf",
urls = ["https://github.com/bazelbuild/rules_docker/releases/download/v0.25.0/rules_docker-v0.25.0.tar.gz"],
)
Anything else relevant?
Nope
Curiously, this seems to work:
container_image(
name = "image",
base = "@dotnet_runtime_deps_6_0_10//image",
layers = [
- ":install_git",
],
+ tars = [
+ ":install_git",
+ ],
)
It is also strange that the hashes on GitHub CI and on my machine differ.
You call tools in your container which aren't hermetic, like apt-get install - so that tool produces a different output. Bazel can only provide determinism if the tools it runs do.
> You call tools in your container which aren't hermetic, like apt-get install - so that tool produces a different output. Bazel can only provide determinism if the tools it runs do.
There are commands to clean up the noise from apt-get (although it is possible something was missed). It appears to be deterministic when using tars but not layers.
This fix also seems to improve remote cacheability, and may help solve #2195.
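Separately from the fix mentioned above, and for readers who assemble a layer tarball themselves, the usual sources of tar non-determinism (entry order, timestamps, ownership) can be pinned when the archive is created. The sketch below assumes GNU tar 1.28 or newer for --sort=name; deterministic_tar is a hypothetical helper, not something rules_docker provides, and it does not make non-hermetic commands such as apt-get install produce stable file contents.

```sh
# Create OUT from the contents of DIR with normalised metadata:
#   --sort=name       fixed entry order, independent of directory order
#   --mtime=@0        every entry stamped with the Unix epoch
#   --owner/--group   ownership forced to 0:0, stored numerically
#   --format=gnu      avoids pax extended headers carrying atime/ctime
# OUT should live outside DIR so the archive does not include itself.
deterministic_tar() {
  out=$1
  dir=$2
  tar --create --file "$out" \
      --sort=name \
      --mtime=@0 \
      --owner=0 --group=0 --numeric-owner \
      --format=gnu \
      -C "$dir" .
}
```

With identical file contents and modes, two archives produced this way should be byte-for-byte equal even if the files were touched in between.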