Fall back to downloading with HttpDownloader if assets are not found in the remote cache
Fixes #12417.
Fixes #16264.
If I understand correctly, the design doc for this feature gives this as one reason people use it: "Some organizations would like to enforce policies on use of third-party code, and a build tool that directly downloads external dependencies will bypass technical controls that implement these policies." (https://github.com/bazelbuild/proposals/blob/master/designs/2020-01-14-remote-downloads.md)
By falling back to downloading via HTTP we would circumvent that. So maybe we should make this configurable? Maybe by turning the downloaders into a list of strategies, people who want both could use "grpc,http" and people who want to enforce grpc-only would remove http from it? 🤔
OTOH I can also see the use case where people use this just as an additional cache - in which case maybe even printing a warning on cache miss is too much noise as it would happen frequently.
WDYT? (Also @jmillikin-stripe)
Sorry for missing the design doc.
Making this configurable sounds reasonable. How about using flag --experimental_remote_downloader_http_fallback? (like we added --remote_local_fallback for falling back to local execution)
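For illustration, the suggested flag could sit alongside the existing remote downloader setting in a `.bazelrc` like this (the endpoint is made up, and `--experimental_remote_downloader_http_fallback` is only a proposal from this thread, not a flag that exists today):

```
# Route repository downloads through a remote downloader implementing
# the Remote Asset API (hypothetical endpoint).
build --experimental_remote_downloader=grpc://downloader.example.com:8980

# Proposed in this thread: on a remote-downloader miss, fall back to
# Bazel's built-in HttpDownloader instead of failing the fetch.
build --experimental_remote_downloader_http_fallback
```

Users who want to enforce grpc-only downloads would simply leave the fallback flag out.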
The linked issue indicates a bug in the remote downloader server, not in Bazel. The server responded that it had successfully resolved the download request and had put the content into the remote cache, but then when Bazel went and tried to fetch it, the cache said that key wasn't found.
This is roughly the same behavior as what you'd get if a remote executor said it had finished an action, but didn't write the action results into the cache. Bazel doesn't automatically fall back to local execution in that case, and similarly, the download code shouldn't fall back to the local HTTP downloader with ambient authority.
More generally, I think a fallback doesn't make sense because the purpose of the remote downloader is to fully override Bazel's access to external resources. It's not just an optional lookup into the remote cache, it's a proxy that has a role in the security of a build. As one example, if a company wants to forbid downloading assets without a checksum, this fallback would bypass the remote downloader that enforces the policy.
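As a concrete sketch of the checksum-policy example: a policy-enforcing remote downloader could reject fetches that omit a checksum, which a plain HTTP fallback would never do. A hypothetical `WORKSPACE` fetch that such a policy would flag might look like this (the repository name and URL are illustrative):

```
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "some_dep",  # hypothetical dependency
    urls = ["https://example.com/some_dep-1.0.tar.gz"],
    # sha256 = "...",   # omitted: a checksum-enforcing downloader
    #                   # would refuse this fetch, while an HTTP
    #                   # fallback would happily download it
)
```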
I agree with you that automatically falling back to the HTTP downloader without being asked explicitly is not good. But I think optionally enabling this could benefit other use cases, e.g. providing a cache for the CI workers.
If this fallback is only enabled by a flag, I don't think it introduces a security issue, since users who have access to the Bazel command-line options can always disable the remote downloader and use the HTTP downloader instead.
> I agree with you that automatically falling back to the HTTP downloader without being asked explicitly is not good. But I think optionally enabling this could benefit other use cases, e.g. providing a cache for the CI workers.
The current code in Bazel works fine for caching between CI workers -- we've been using it in that mode for several months.
I strongly recommend that you file a bug against the download server you're using, because it sounds like that server is returning incorrect responses. I do not think it's a good idea for Bazel to try working around buggy implementations of the remote APIs.
IMO it would feel more consistent with the other cache operations if this were to fall back to downloading locally. But as long as this is made configurable, I don't think it matters too much what the default is.