rules_k8s Dependency on volatile-status.txt breaks remote cache

Bazel pretends that the volatile file never changes

However, that is untrue when it comes to the remote cache.

Our CI environment runs e2e tests that depend on a k8s_object on every run even when there are no changes due to the interaction between rules_k8s, stamping, and the remote_cache.

I'd like to propose dropping the dependency on the volatile file, while retaining the stable file, or at least making it optional as a workaround since the Bazel maintainers are pushing back on fixing the underlying issue.

Mar 06 '20 20:03 mariusgrigoriu

I found that depending on the .substituted.yaml output file is a workaround to this remote caching issue. The downside is that there may be cases when the test doesn't run due to some changes in the rule related to stamping.

Mar 07 '20 01:03 mariusgrigoriu

General best practice here is to avoid stamping -- stamping is by definition volatile.

Another practice we use over in kubernetes/test-infra is to separate out the stamp layer into its own final layer.

AKA instead of making the final stamped layer include a large go binary in it, we put this into an intermediary layer.

https://github.com/kubernetes/test-infra/blob/35cfa7b7d66098fabedb8fb1c0e15be29088971e/prow/def.bzl#L26-L49

In other words instead of using the rules_docker container_image directly we wrap this into a prow_image macro which separates out the stamped layer from the large binary layer.

This way 99% of the image layers remain reproducible (and therefore cached) and just the tiny final layer with stamp info changes between builds (assuming the image hasn't changed)

Jun 20 '20 07:06 fejta

@mariusgrigoriu and @fejta I created https://github.com/bazelbuild/rules_k8s/pull/592 as another attempt at addressing this issue. It exposes the stamp_srcs attribute as a means to override volatile-status.txt (and stable-status.txt if desired).

Jul 21 '20 13:07 mmikitka

That looks like a creative solution!

We've settled into adding the .substituted.yaml output and a custom resolver as data dependencies on e2e tests for our project. The test uses those dependencies only to trigger e2e execution. The custom resolver substitutes values like cluster and namespace, which are different per pipeline run so we don't run e2e with every commit.

I think the PR might simplify things a bit so we can just have the k8s_object rule as the data dependency as long as our resolver gets its values from sources other than the stable and volatile status files. If we did want to use those files, we might be stuck either way, but it would require some testing.

Jul 21 '20 17:07 mariusgrigoriu

Found a use case where depending on the .substituted.yaml is problematic and that's with k8s_objects. There isn't a single file to depend upon so the cleanest and easiest way would be to depend on the rule itself, so looks like #592 could be useful with that.

Jul 24 '20 00:07 mariusgrigoriu

Just tested PR #592. I wasn't able to find a way to include only the stable status file as I don't know what the label for the generated stabel-status.txt is. However, I was able to not depend on any stamp files by referencing an empty filegroup. That said, it would be nicer if you could set stamp_srcs=[] or stamp=False to accomplish that goal. All in all, this seems to work if you don't want to rely on any of bazel's stamps, including the stable ones.

Jul 24 '20 05:07 mariusgrigoriu

@mariusgrigoriu I am using the following approach, which concatenates and filters out values from stable-status and volatile-status:

genrule(
    name = "all-status",
    visibility = ["//visibility:public"],
    outs = [
        "all-status.txt",
    ],
    cmd = ("cat bazel-out/stable-status.txt bazel-out/volatile-status.txt" +
           "| grep -Ev '(BRANCH_TAIL|BUILD_TIMESTAMP|BUILD_HOST|BUILD_NUMBER|GIT_COMMIT_BODY)' > $(location :all-status.txt) "),
    stamp = True,
)

k8s_object(
    name = "foo",
    ...
    stamp_srcs = ["//:all-status"],
)

All the best!

Aug 14 '20 13:08 mmikitka

The above example helps a lot. I think this works good enough. What do you think about including stamp_srcs in the k8s_default? I'm finding that we want every rule to have it, so it would be nice if we could set it as a default. Happy to submit PR / issue on it.

Oct 02 '20 00:10 mariusgrigoriu

rules_k8s rules_k8s copied to clipboard

Dependency on volatile-status.txt breaks remote cache

rules_k8s
rules_k8s copied to clipboard