include grpc-xds library
to enable support of the xds protocol over grpc for remote builds/caching
@tmhdgsn, Could you please take a look at the failing checks?
Hey @satyanandak took a look, lots of unknown host exceptions for github.com and storage.googleapis.com
Can it be re-run?
@satyanandak @sgowroji sorry I can see the checks got re-rerun but the test logs still show host exceptions for github and storage.googleapis.com
Can someone take a look ? Thanks
Why do you need this in Bazel itself?
We would like this so that we can use xds to manage our traffic and auth policies for all of bazel’s communication. Without this we have to mange a bunch of proxying infrastructure across all developer hosts. This seemed a reasonable ask as it is grpc-native integration, supported by all the main language implementations.
@Wyverald - does this seem like a reasonable ask? Thanks!
Could you help me understand how this PR by itself allows you to, well, do anything differently? Are there future PRs planned? Or are you running a patched version of Bazel yourself? Or, just by including the xds library when building Bazel itself, are you now able to specify some xds-specific flags/protocols when using Bazel? (Sorry for the confusion -- I know very little about grpc/xds)
just by including the xds library when building Bazel means we can give it an "xds:///" address for remote builds/caching and it understands how to resolve it
I see, that answers the question. Please rebase the PR and address the check failures, then we can import.
sure will rebase - can we get some help with the check failures? It looks like its failing with Unknown host exceptions for github.com and storage.googleapis.com
ah. it's because we need to build the distribution archive that's used to bootstrap a Bazel build, and the bootstrapping process must not access the internet.
Looks like you need to add a line "grpc-java++grpc_java_repositories_extension+com_github_cncf_xds", to https://github.com/bazelbuild/bazel/blob/20f9a5e8258dce0d30882a3a2fdb4219596fb1ae/repositories.bzl#L64
hmm. the failure on arm64 macOS looks legit. This adds ~100MB to a ~380MB archive, which is a rather big increase (~26%).
This test doesn't fail on other platforms because apparently arm64 macOS is the "bottleneck"; the install base is only ~200MB on Linux, and this PR adds ~50MB (similar ratio of increase). I wonder if the arm64 macOS binary is big because of Rosetta?
Digging through history, the install base size has been as high as 460MB before, so 480MB isn't unthinkable. But cc-ing @meteorcloudy to double-check. (Also, if anyone has ideas how to reduce this size increase, please share.)
This PR will increase the Bazel binary size from 56M to 67M on macOS arm64 platform, which is quite a huge bump.
~~But I do notice we probably have duplicated protobuf dependencies in the Bazel binary already after https://github.com/bazelbuild/bazel/pull/23767, let me check how much we could save after deduplicating.~~
This is not true, rules_jvm_external somehow redirects @maven//:com_google_protobuf_protobuf_java to @@protobuf+//:protobuf_java.
This is the diff of files included in //src/main/java/com/google/devtools/build/lib/bazel:BazelServer_deploy.jar https://gist.github.com/meteorcloudy/53bccf216c66e49427615a3e2b824644
/cc @meisterT @lberki How do you feel about bumping the Bazel binary size to support the xds protocol?
/cc @comius @tjgq since this is remote caching/execution related.
Many of the new files look unused. Could we add an integration test to verify that this new feature actually works and then rely on proguard to strip out unneeded deps similar to what we do for fastutil?
If that doesn't work, could we provide a way for users to supply their own jar files with grpc plugins for additional connection schemes?
And if this feature only relies on the correct jars being available during runtime, can users just inject them using --host_jvm_args at runtime, similar to https://bazel.build/docs/user-manual#memory-tracking
Do I understand correctly that this increase the size of the Bazel binary only by a single-digit number of megabytes, but the install base by ~50MB?
If so, where does the increase come from? I assume it's mostly Java classes, but then what does all that code do?
Do I understand correctly that this increase the size of the Bazel binary only by a single-digit number of megabytes, but the install base by ~50MB?
Bazel binary increases is 11MB not exactly single digit, the install base is much bigger because it's decompressed from the Bazel binary.
the install base is much bigger because it's decompressed from the Bazel binary.
I get that. What I'm asking is what does this 50MB of code do. It seems to be a lot for what it apparently does.
Yeah, maybe we can use proguard to largely trim it down.
Much of the code is generated protobufs defined by Envoy. For the binary uploaded to Maven Central we cherry-pick individual files and it is 10 MB. Testing right now with a deploy.jar and the deps I see here, I see an 18 MB increase by adding grpc-xds. I noticed 4 MB of that was old/unused: https://github.com/grpc/grpc-java/pull/11667 .
Although there's a few cases of extra files brought in with Bazel, I think most of the remaining 4 MB increase compared to Maven Central is not real, simply because it comes from transitive deps. Of those transitive deps, the only other potentially-unnecessary-given-the-current-architecture dependency I'm seeing powers google_default credentials; that's bringing in 3 MB of deps. We'd have to use reflection to avoid that.
This is not true, rules_jvm_external somehow redirects
@maven//:com_google_protobuf_protobuf_javato@@protobuf+//:protobuf_java.
That would be because of grpc-java adds maven.overrides. Any mixing of java_proto_library with maven_install needs them.
Thank you all for looking at this ; re google_default_credentials, we would very much like that to also ‘just work’, so if we could leave that 3mb in that would be great. (Unless I’m misunderstanding the implication of ‘potentially-unnecessary-given-the-current-architecture’ of course!)
Yeah, to use google_default creds you'd need those deps. The potentially-unnecessary-given-the-current-architecture was just that we already have the plumbing to make that more optional. Some other things could theoretically be more optional, but would take a more concerted effort.
So what are our next steps here exactly please? Are we comfortable with the size bump? @ejona86 there was an ask for a test, which seems reasonable. What do you think would be a reasonable sanity check?
Thank you all again!
A 20% size bump is huge, so we should consider other options first, one of which might be what Eric suggests.
Alternatively, did you try out what Yun suggested above?
I'm afraid I'm not very up-to-date with gRPC and therefore I don't know what the official one-line description "xDS is the protocol initially used by Envoy, that is evolving into a universal data plan API for service mesh." means.
Given the large size increase, the benefits must be commensurate. So I'd either need to be convinced that this is something very, very good, the size penalty would need to be reduced or we'd need to find a way to make this code optional.
Bazel@HEAD currently clocks in at 98MB compressed, 205MB uncompressed according to my rudimentary measurements (bazel build //src:bazel; cd $(bazel-bin/src/bazel info install_base); du -sh .).
@ejona86 how would these numbers change in the best version and what are those bytes spent on? If I understand correctly that it's +10MB/+50MB, I'd much rather not merge this unless the benefits are truly fantastic.
Bazel@HEAD currently clocks in at 98MB compressed
I believe this is without -c opt, it's much smaller with this option. (56M for me on macOS arm64)
Indeed, I didn't realize -c opt also compresses the binary more. 62MB for me on Linux instead of 98MB (the 205MB number is unchanged, though)