bazel
bazel copied to clipboard
Looks like `volatile-status.txt` content affects the caching key on shared cache
Description of the bug:
I want at the same time:
- upload new application only if application was actually changed;
- add stamping information to application.
As far as I understand, if only the volatile-status.txt
file has changed between builds, then this should not cause the application to be rebuilt.
For me, changing this file is ignored only for two consecutive builds on the same working copy.
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
Minimal example repository: https://github.com/bozaro/bazel-stamping Steps for reproduce:
- checkout
- execute
./stamping.sh
(this script show artifact builds difference at last step)
Expected result: empty diff and exit code 0
Actual result: non-empty diff and exit code 1
What stamping.sh do?
.bazelrc:
build --disk_cache=bazel-cache
What stamping.sh
do:
# Generate bazel-bin/stamping.txt based on stamping.sh and volatile-status.txt
bazel build :stamping --stamp
tee pass-1.txt < bazel-bin/stamping.txt
# Wait one second to have differ BUILD_TIMESTAMP in volatile-status.txt
sleep 1
# Remove local bazel cache (but disk cache is still present)
bazel clean
# Generate bazel-bin/stamping.txt based on stamping.sh and volatile-status.txt
bazel build :stamping --stamp
tee pass-2.txt < bazel-bin/stamping.txt
# Compare first and second pass
diff pass-1.txt pass-2.txt
Which operating system are you running Bazel on?
Ubuntu 22.04.1 LTS
What is the output of bazel info release
?
release 5.3.0
If bazel info release
returns development version
or (@non-git)
, tell us how you built Bazel.
No response
What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD
?
$ git remote get-url origin; git rev-parse master; git rev-parse HEAD
[email protected]:bozaro/bazel-stamping.git
599f3d8a1df088443d6f5868c43bd0f59b48aaf3
599f3d8a1df088443d6f5868c43bd0f59b48aaf3
### Have you found anything relevant by searching the web?
_No response_
### Any other information, logs, or outputs that you want to share?
_No response_
This is a duplicate of https://github.com/bazelbuild/bazel/issues/10075, note in particular the discussion starting from https://github.com/bazelbuild/bazel/issues/10075#issuecomment-546872111.
I saw #10075 issue, but did not pay due attention to it, since it was closed an eternity ago - in 2019.
It is especially frustrating that the local workspace and local disk caches have such different behavior.
I will think about how to solve the seemingly trivial task of saving a commit, from which you can build the executable file again. With current behaviour, I should completely find and remove all volatile-status.txt
usage (even third-party dependency) from the our project.
I tried to generate a caching key without volatile-status.txt
digest (in this case action key
!= sha256(action bytes)
). It's works perfectly on local disk_cache
, but stuck on bazel-buildfarm.
Looks like action key
must be sha256(action bytes)
.
In this case looks like correct volatile-status.txt
behaviour can't be implemented without protocol modification: volatile data should be outside Action data.
If you care about this, I think the next step would be to start a discussion with the Remote Execution API working group to add protocol support for it.
https://groups.google.com/g/remote-execution-apis/c/Fg72Ewwmim0
The documentation should probably be updated to reflect this behavior, if it's intended, or is too fundamental to the architecture to change.
"Bazel expects them to change all the time, like timestamps do, and duly updates the bazel-out/volatile-status.txt file. In order to avoid rerunning stamped actions all the time though, Bazel pretends that the volatile file never changes. In other words, if the volatile status file is the only file whose contents has changed, Bazel will not invalidate actions that depend on it. If other inputs of the actions have changed, then Bazel reruns that action, and the action will see the updated volatile status, but just the volatile status changing alone will not invalidate the action."
should probably become something more like:
"Like timestamps, Bazel expects them to change, and updates the bazel-out/volatile-status.txt file. To avoid rerunning stamped actions, Bazel servers pretend that the volatile file never changes. In other words, if the volatile status file is the only file whose contents have changed, a local Bazel server will ignore the new contents of volatile-status.txt unless other inputs of the actions have changed. When Bazel reruns a stamped action, the action will see the updated volatile status."
to highlight the limitations of these scheme especially as they apply to solving problems like https://github.com/bazelbuild/bazel/issues/7466#issuecomment-490815937.
In my case.
Stamping is an extremely useful feature that allows you to save information like "this file can be compiled from the sources of revision X" or "this file can be debug with source from revision X" without reassembling all executable files for each commit in the mono repository.
But in reality it only works under the conditions:
- disk cache or remote cache is not used (otherwise stamping always triggers cache miss);
- the branch on the build machine does not change (the cache only works between neighboring builds).
That is, in fact, this feature does not work.
For workaround I use patched bazel version:
- https://github.com/bazelbuild/bazel/pull/16240
It should be possible to use the --experimental_remote_scrub_config flag (new in Bazel 7) to scrub volatile-status.txt from the cache key. This will only work when using a disk/remote cache with local execution, though (remote execution support, as discussed above, would require protocol changes).
New patch version to allow use stamping with remote build for Bazel 7 (actions with stamping executes locally):
- https://github.com/bazelbuild/bazel/pull/20070