protobuf
Use prebuilt binaries instead of building protoc from sources
Fetch the released prebuilt binaries and use them in a platform-based select, which picks the appropriate binary for the platform or falls back to building from sources. (It's mostly just about wiring things together for WORKSPACE and bzlmod.)
Works with both WORKSPACE and bzlmod mode. It even works with --incompatible_enable_proto_toolchain_resolution, because the only configured toolchain now points to the target that has the select.
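A minimal sketch of what such a platform-based select could look like. Only `//:protoc` is taken from this discussion; the `config_setting` name, the `@prebuilt_protoc_linux_x86_64` repository, and the `//:protoc_from_sources` target are hypothetical names used for illustration, not the labels this PR actually defines.

```starlark
# BUILD.bazel (sketch; labels other than //:protoc are hypothetical)

config_setting(
    name = "linux_x86_64",
    constraint_values = [
        "@platforms//os:linux",
        "@platforms//cpu:x86_64",
    ],
)

alias(
    name = "protoc",
    actual = select({
        # A prebuilt binary exists for this platform: use it.
        ":linux_x86_64": "@prebuilt_protoc_linux_x86_64//:protoc",
        # No prebuilt binary for this platform: build protoc from sources.
        "//conditions:default": "//:protoc_from_sources",
    }),
    visibility = ["//visibility:public"],
)
```

Because the configured toolchain points at `//:protoc`, the same select would be hit with or without `--incompatible_enable_proto_toolchain_resolution`.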
This PR should be applied during the release: after archives with prebuilt binaries are created (so that we can compute checksums), but before the source archive is created.
Please ignore the changes to .pb.h and versions; those are there just to make the examples work, because the version of the prebuilt binaries needs to match the sources.
Today we tag a release commit and then create binaries so the binaries match the github release tag. Manipulating our source archive to include a commit updating prebuilt binaries might be doable, but would make it deviate from the tagged release commit (and github's source archive) where I think we'd also have to remove prebuilts in the interim. I assume we can't have the release commit point to its own artifacts?
1st option: I think I could reorganize this PR, so that during a release you overwrite a single .bzl file with the checksums of prebuilt binaries. The tag would point to sources with the original .bzl file.
2nd option: Since Bazel is hermetic it can generate identical binaries and .zip files every time. (I'm not sure how release zips are generated at the moment). So in theory we could overwrite checksums before the release, even before the bzl files exist. Tag them and then do the release.
3rd option:
- Generate zips.
- Make a commit with new checksums.
- Tag
- Make a release
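To make the 1st option above more concrete, here is a rough sketch of a single checksum file plus a helper that consumes it. The file names, the `PREBUILT_PROTOC_SHA256` dict, the `prebuilt_protoc_repos` macro, and the `prebuilt_protoc.BUILD` file are all hypothetical; the URL pattern assumes a final release where the tag and the archive name use the same version string, which is not the case for release candidates.

```starlark
# ---- file: prebuilt_checksums.bzl (hypothetical name) ----
# The tagged sources would carry this dict empty; the release process
# would overwrite this one file with the real checksums.
PREBUILT_PROTOC_SHA256 = {
    # "linux-x86_64": "<sha256 of protoc-<version>-linux-x86_64.zip>",
}

# ---- file: prebuilt_repos.bzl (hypothetical name) ----
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
load(":prebuilt_checksums.bzl", "PREBUILT_PROTOC_SHA256")

def prebuilt_protoc_repos(version):
    """Declares one repository per platform that has a recorded checksum."""
    for platform, sha256 in PREBUILT_PROTOC_SHA256.items():
        http_archive(
            name = "prebuilt_protoc_" + platform.replace("-", "_"),
            url = "https://github.com/protocolbuffers/protobuf/releases/download/v{v}/protoc-{v}-{p}.zip".format(
                v = version,
                p = platform,
            ),
            sha256 = sha256,
            # BUILD file that wraps bin/protoc (hypothetical label).
            build_file = Label("//:prebuilt_protoc.BUILD"),
        )
```

With the dict empty, as it would be at the tagged commit, the loop declares no repositories, so a build from the tag would rely on the source fallback.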
While I like the direction, what's the reason for not registering a toolchain per OS/architecture with --incompatible_enable_proto_toolchain_resolution plus the existing source-based toolchain last? That would also provide a natural opt-in to source-based toolchains: explicitly registering the source based toolchain first in the main projects MODULE.bazel or .bazelrc file.
The downsides of the select approach I see compared to toolchains are that it doesn't influence the exec platform and that prebuilt support would be limited to OS/arch combinations provided by this repo.
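For comparison, the toolchain-based alternative raised here might be wired up roughly as follows. The package `//toolchains` and every target name in it are hypothetical, and the `toolchain_type` label is an assumption based on rules_proto; the label used by protobuf itself may differ.

```starlark
# ---- file: toolchains/BUILD.bazel (sketch; all names hypothetical) ----
toolchain(
    name = "prebuilt_protoc_linux_x86_64_toolchain",
    # Only selected when the execution platform is Linux x86_64.
    exec_compatible_with = [
        "@platforms//os:linux",
        "@platforms//cpu:x86_64",
    ],
    # Target that provides the proto toolchain info for the prebuilt binary.
    toolchain = ":prebuilt_protoc_linux_x86_64",
    toolchain_type = "@rules_proto//proto:toolchain_type",  # assumed label
)

# ---- file: MODULE.bazel of protobuf (sketch) ----
register_toolchains(
    "//toolchains:prebuilt_protoc_linux_x86_64_toolchain",
    # ... one entry per OS/architecture with a prebuilt binary ...
    # Registered last, so it is chosen only when no prebuilt toolchain matches.
    "//toolchains:protoc_sources_toolchain",
)
```

A project that prefers building from sources could then register the source-based toolchain itself, since toolchains registered by the root module are considered before those of its dependencies.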
While I like the direction, what's the reason for not registering a toolchain per OS/architecture with --incompatible_enable_proto_toolchain_resolution plus the existing source-based toolchain last?
The registered toolchain points to //:protoc, so the select also gets used. --incompatible_enable_proto_toolchain_resolution works if there's a prebuilt protoc for the first execution platform.
If there are many execution platforms and the first one doesn't have a prebuilt binary, it would fall back to the sources.
@comius - can you update this PR to resolve the merge conflicts?
Done. (I rebased to 30.x)
The latest test run has issues across a wide spectrum of configurations, including:
- all Windows example tests
- all CMake tests
- all Java and C# tests
- most Objective-C tests
@JasonLunn I can fix the failures, but perhaps you missed the point: this PR is intended to demonstrate how to use prebuilt binaries. Even with the failures fixed, you wouldn't be able to merge it, because it depends on a released version of the binaries.
Hey all,
I got an idea over a lunch break for how to implement this without modifying the protobuf release procedures. It wouldn't be hard to implement at all. The only caveat is that it works only with bzlmod.
The sketch of the solution is:
- With each release we publish to the BCR not only the protobuf repo, but also the protobuf-linux-x86_64, protobuf-linux-x86_32, ... modules.
- Module protobuf-linux-x86_64 uses as its source a URL pointing to the prebuilt protoc binary for that platform: "https://github.com/protocolbuffers/protobuf/releases/download/v30.0-rc2/protoc-30.0-rc-2-linux-x86_64.zip"
- it is patched with BUILD file from this PR:
  load("@bazel_skylib//rules:native_binary.bzl", "native_binary")

  native_binary(
      name = "protoc",
      src = "bin/protoc",
      out = "protoc",
      visibility = ["//visibility:public"],
  )
- it is also patched with a trivial MODULE.bazel file.
- protobuf MODULE would depend on all of the new modules, matching the version
- releases would be tested with prebuilt binaries disabled; bzlmod's laziness would keep us from downloading dependencies that don't exist yet
- after release, all the modules start to exist on BCR and prebuilt binaries start to work
Why does this work? There's no need to compute checksums. They are computed during release to BCR and verified by bzlmod. The content of protobuf MODULE is unchanged during the release.
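As an illustration of the sketch above, the trivial MODULE.bazel patched into a per-platform module and the matching dependency on the protobuf side might look like this. The version numbers are placeholders, not values taken from a real release.

```starlark
# ---- file: MODULE.bazel patched into protobuf-linux-x86_64 (sketch) ----
module(
    name = "protobuf-linux-x86_64",
    version = "30.0",  # placeholder; would match the protobuf release
)

# Needed because the patched BUILD file loads native_binary from bazel_skylib.
bazel_dep(name = "bazel_skylib", version = "1.7.1")  # placeholder version

# ---- file: MODULE.bazel of protobuf (sketch) ----
# One dependency per published platform module, at the matching version.
bazel_dep(name = "protobuf-linux-x86_64", version = "30.0")
# The prebuilt binary would then be addressable as
# @protobuf-linux-x86_64//:protoc.
```

The archive checksum would live in the BCR entry for each platform module rather than in the protobuf sources, which is why the protobuf MODULE file would not need to change during a release.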
WDYT? Should we move forward with this?
cc @pcloudy @fmeum @meisterT
That's a pretty decent approach.
The only downside I see is that users may be led to (mistakenly) believe that they should depend on protobuf-linux-x86_64 directly when they see it in the BCR. In any case, we would have to add "nodep" deps from the protobuf-* modules back to protobuf to ensure that Minimum Version Selection doesn't end up selecting different versions for the various architectures.
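A "nodep" dependency of the kind mentioned here could be expressed as below. This is a sketch that assumes a Bazel version supporting `repo_name = None` on `bazel_dep`; the version string is a placeholder.

```starlark
# MODULE.bazel of protobuf-linux-x86_64 (sketch; version is a placeholder)

# repo_name = None marks this as a "nodep" dependency: it does not pull
# protobuf into the graph on its own, but if protobuf is already present,
# it takes part in version selection and so pushes protobuf to at least
# this version.
bazel_dep(name = "protobuf", version = "30.0", repo_name = None)
```

Each protobuf-* module would carry the same line, so that the platform modules and protobuf itself cannot drift apart under Minimum Version Selection.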
This approach doesn't preclude the use of toolchains instead of select in the future, which is nice.
I agree, that's a nice formulation that avoids touching the existing release procedure for the protobuf module and allows OSS maintainers to own this setup without google3 access. I think we should use toolchains rather than select from the beginning, but this is easily done based on what's already in toolchains_protoc.