
Looks like `volatile-status.txt` content affects the caching key on shared cache

Open bozaro opened this issue 2 years ago • 7 comments

Description of the bug:

I want at the same time:

  • upload new application only if application was actually changed;
  • add stamping information to application.

As far as I understand, if only the volatile-status.txt file has changed between builds, the application should not be rebuilt.

In practice, changes to this file are ignored only across two consecutive builds in the same working copy.

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Minimal example repository: https://github.com/bozaro/bazel-stamping Steps to reproduce:

  • check out the repository
  • execute ./stamping.sh (the last step of the script shows the difference between the artifacts of the two builds)

Expected result: empty diff and exit code 0

Actual result: non-empty diff and exit code 1

What does stamping.sh do?

.bazelrc:

build --disk_cache=bazel-cache

What stamping.sh does:

# Generate bazel-bin/stamping.txt based on stamping.sh and volatile-status.txt
bazel build :stamping --stamp
tee pass-1.txt < bazel-bin/stamping.txt
# Wait one second so that BUILD_TIMESTAMP in volatile-status.txt differs
sleep 1
# Remove local bazel cache (but disk cache is still present)
bazel clean
# Generate bazel-bin/stamping.txt based on stamping.sh and volatile-status.txt
bazel build :stamping --stamp
tee pass-2.txt < bazel-bin/stamping.txt
# Compare first and second pass
diff pass-1.txt pass-2.txt

Which operating system are you running Bazel on?

Ubuntu 22.04.1 LTS

What is the output of bazel info release?

release 5.3.0

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

$ git remote get-url origin; git rev-parse master; git rev-parse HEAD
git@github.com:bozaro/bazel-stamping.git
599f3d8a1df088443d6f5868c43bd0f59b48aaf3
599f3d8a1df088443d6f5868c43bd0f59b48aaf3


Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

bozaro avatar Sep 07 '22 11:09 bozaro

This is a duplicate of https://github.com/bazelbuild/bazel/issues/10075, note in particular the discussion starting from https://github.com/bazelbuild/bazel/issues/10075#issuecomment-546872111.

tjgq avatar Sep 07 '22 12:09 tjgq

I saw issue #10075, but did not pay enough attention to it, since it was closed an eternity ago, in 2019.

It is especially frustrating that the local workspace and local disk caches have such different behavior.

I will think about how to solve the seemingly trivial task of recording the commit from which the executable can be built again. With the current behaviour, I would have to find and remove every use of volatile-status.txt (even in third-party dependencies) from our project.

bozaro avatar Sep 07 '22 13:09 bozaro

I tried to generate a caching key without the volatile-status.txt digest (in this case action key != sha256(action bytes)). It works perfectly with a local disk_cache, but gets stuck on bazel-buildfarm. It looks like the action key must be sha256(action bytes).

If so, it looks like correct volatile-status.txt behaviour can't be implemented without a protocol modification: volatile data should live outside the Action data.
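To illustrate the point about the key, here is a toy sketch (not Bazel's actual code). It assumes the action key is a SHA-256 over a made-up text encoding of the command and the input digests, whereas the real protocol hashes a serialized Action message. The helper names `action_key` and `scrubbed_key` and all sample values are hypothetical, and `sha256sum` is assumed to be available (it is on typical Linux systems).

```shell
#!/bin/sh
# Toy model only: real Bazel hashes a serialized Action message, not a text line.

# SHA-256 of a string, hex encoded.
digest() { printf '%s' "$1" | sha256sum | cut -d' ' -f1; }

# Hypothetical helper: key over the command, the source digest and the volatile digest.
action_key() { digest "$1|$2|$3"; }

# Hypothetical helper: same key, but with the volatile input left out.
scrubbed_key() { digest "$1|$2"; }

# Made-up inputs: the source is identical, only the timestamp differs.
src=$(digest "contents of stamping.sh")
vol_1=$(digest "BUILD_TIMESTAMP 1662549000")
vol_2=$(digest "BUILD_TIMESTAMP 1662549001")

key_1=$(action_key "genrule stamping" "$src" "$vol_1")
key_2=$(action_key "genrule stamping" "$src" "$vol_2")
scrub_1=$(scrubbed_key "genrule stamping" "$src")
scrub_2=$(scrubbed_key "genrule stamping" "$src")

echo "full key, pass 1:     $key_1"
echo "full key, pass 2:     $key_2"
echo "scrubbed key, pass 1: $scrub_1"
echo "scrubbed key, pass 2: $scrub_2"
```

In this model the two full keys differ although nothing but the timestamp changed, while the scrubbed keys are identical. A server that recomputes sha256(action bytes) itself would reject the scrubbed key, which matches what I observed with bazel-buildfarm.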

bozaro avatar Sep 08 '22 20:09 bozaro

If you care about this, I think the next step would be to start a discussion with the Remote Execution API working group to add protocol support for it.

tjgq avatar Sep 14 '22 08:09 tjgq

https://groups.google.com/g/remote-execution-apis/c/Fg72Ewwmim0

bozaro avatar Sep 18 '22 19:09 bozaro

The documentation should probably be updated to reflect this behavior, if it's intended, or is too fundamental to the architecture to change.

"Bazel expects them to change all the time, like timestamps do, and duly updates the bazel-out/volatile-status.txt file. In order to avoid rerunning stamped actions all the time though, Bazel pretends that the volatile file never changes. In other words, if the volatile status file is the only file whose contents has changed, Bazel will not invalidate actions that depend on it. If other inputs of the actions have changed, then Bazel reruns that action, and the action will see the updated volatile status, but just the volatile status changing alone will not invalidate the action."

should probably become something more like:

"Like timestamps, Bazel expects them to change, and updates the bazel-out/volatile-status.txt file. To avoid rerunning stamped actions, Bazel servers pretend that the volatile file never changes. In other words, if the volatile status file is the only file whose contents have changed, a local Bazel server will ignore the new contents of volatile-status.txt unless other inputs of the actions have changed. When Bazel reruns a stamped action, the action will see the updated volatile status."

to highlight the limitations of this scheme, especially as they apply to solving problems like https://github.com/bazelbuild/bazel/issues/7466#issuecomment-490815937.
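For context, the status files in the quoted documentation are filled from the output of the script passed via `--workspace_status_command`: each line is `KEY value`, keys starting with `STABLE_` go to stable-status.txt, and all other keys go to volatile-status.txt. A minimal sketch follows; the script path, the `RELEASE_CHANNEL` variable and both key names are hypothetical.

```shell
#!/bin/sh
# Hypothetical status script, e.g. saved as tools/workspace_status.sh and used as:
#   bazel build :stamping --stamp --workspace_status_command=tools/workspace_status.sh
# Each output line has the form "KEY value".

print_status() {
  # STABLE_ prefix: written to stable-status.txt; a changed value reruns stamped actions.
  echo "STABLE_RELEASE_CHANNEL ${RELEASE_CHANNEL:-dev}"
  # No STABLE_ prefix: written to volatile-status.txt; a change alone does not
  # invalidate actions in the local Bazel server.
  echo "GIT_COMMIT $(git rev-parse HEAD 2>/dev/null || echo unknown)"
}

print_status
```

With a script like this, the commit hash lands in the volatile file, which is exactly the case where local invalidation and disk/remote cache keys behave differently.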

drewmacrae avatar Dec 20 '22 15:12 drewmacrae

In my case:

Stamping is an extremely useful feature that lets you record information like "this file can be compiled from the sources of revision X" or "this file can be debugged with the sources from revision X" without rebuilding every executable for each commit in the monorepo.

But in reality it works only under these conditions:

  • no disk cache or remote cache is used (otherwise stamping always triggers a cache miss);
  • the branch on the build machine does not change (the cache only works between neighboring builds).

That is, in practice, this feature does not work.

As a workaround I use a patched Bazel version:

  • https://github.com/bazelbuild/bazel/pull/16240

bozaro avatar Dec 20 '22 20:12 bozaro

It should be possible to use the --experimental_remote_scrub_config flag (new in Bazel 7) to scrub volatile-status.txt from the cache key. This will only work when using a disk/remote cache with local execution, though (remote execution support, as discussed above, would require protocol changes).
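A rough sketch of what such a configuration might look like. The field names (`rules`, `matcher`, `mnemonic`, `transform`, `omitted_inputs`) are my reading of src/main/protobuf/remote_scrubbing.proto and should be checked against your Bazel version; the file name scrub.textproto and both regular expressions are assumptions, and I have not verified this end to end.

```shell
#!/bin/sh
# Sketch only: verify the schema against remote_scrubbing.proto before relying on it.
# scrub.textproto is a hypothetical file name.
cat > scrub.textproto <<'EOF'
rules {
  matcher {
    # Regex on the action mnemonic; ".*" is an assumption, narrow it in practice.
    mnemonic: ".*"
  }
  transform {
    # Regex on input paths to leave out of the cache key (assumed path pattern).
    omitted_inputs: ".*volatile-status\\.txt"
  }
}
EOF

# Print the invocation instead of running it, so the sketch works without Bazel installed.
echo "bazel build :stamping --stamp --experimental_remote_scrub_config=scrub.textproto"
```

Scrubbing too broadly can make unrelated actions share cache entries, so the matcher should be as narrow as the build allows.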

tjgq avatar Oct 20 '23 14:10 tjgq

A new version of the patch, for Bazel 7, that allows stamping to be used with remote builds (stamped actions are executed locally):

  • https://github.com/bazelbuild/bazel/pull/20070

bozaro avatar Jan 15 '24 05:01 bozaro