remote-apis icon indicating copy to clipboard operation
remote-apis copied to clipboard

bazel: Migrate to bazel 7; use bzlmod for dependencies

Open minor-fixes opened this issue 1 year ago • 15 comments
trafficstars

This change performs a large cleanup, attempting to migrate this repo to building using the latest major version of bazel, as well as to using bzlmod for dependencies instead of WORKSPACE.

In order to demonstrate that builds still work and the repo is more useful for downstream users, an example submodule project is added, that contains:

  • sample MODULE.bazel boilerplate
  • sample proto generation targets for C++, Go, and Java
  • sample applications that import gRPC generated code for C++, Go, and Java

In pursuit of the above, this change:

  • modifies how protos are built:
    • rules are inlined into the BUILD file next to each proto, instead of residing in a subdir
    • different rules are imported, depending on language
    • switched_rules WORKSPACE macros are removed; using bzlmod should make dependency resolution easier, relaxing the need for such a macro
  • updates go.mod to the format supported by the latest Go toolchain
  • removes unused gRPC bazel rules
  • adopts a patch for googleapis based on https://github.com/bazelbuild/rules_go/issues/3685#issuecomment-1711854475

Tested:

  • cd examples/sample_project && bazel build //... works
  • bazel build //... in top-level directory works

minor-fixes avatar Jan 15 '24 07:01 minor-fixes

@minor-fixes Adding "examples" to .bazelignore at the root would avoid Bazel recursing into that directory

mortenmj avatar Jan 17 '24 19:01 mortenmj

@minor-fixes Adding "examples" to .bazelignore at the root would avoid Bazel recursing into that directory

Thanks! I don't know why I expected the presence of a WORKSPACE file in a subdir to stop bazel, but the .bazelignore fix works great!

minor-fixes avatar Jan 19 '24 18:01 minor-fixes

I started with creating bindings in the sample application actually - and reversed course after I realized that I'm not sure what the long-term ramifications of doing specifically that for C++ and ODR violations (say one depends on two projects that each generate, build, and link against their own bindings in their respective libraries - does that work? I only know enough C++ to know that I don't know C++ - and that was back in 2016).

As far as that is an issue, the easiest way to not have that issue is to have one set of bindings after dependency resolution, and then compile those once to be linked in where needed.

It'd be a shame, too - bazel is finally giving us the tools to manage dependencies sanely, instead of the WORKSPACE shenanigans we've had to deal with up until this point. As much as I'd like to be mad about having to learn bzlmod, it seems to work well for the modules in the registry (indeed, most of the issues are with the modules not in the registry!) and it does look to be more concise and tractable than WORKSPACE.

My gut reaction is to go the other way, and put some maintenance elbow grease in (which I'm happy to do! Maybe others will join in over time) because this is a fairly slim repo, and we can work to keep it slim and maximally useful. Hopefully this puts us on the path to being an entry in the Bazel module registry, which will make it easier to import, and the examples should help new projects get spun up more quickly.


I might not be aware of the reasons for avoiding binding generation, though - looking at the ecosystem(s) Google is pushing, it's not clear what "best practice" is - maybe a Bazel expert can weigh in here? What technique (doing binding gen vs. not) builds the best ecosystem at large?

minor-fixes avatar Feb 13 '24 03:02 minor-fixes

I love this PR, thank you for the bzlmod support! One request before this is merged.. Can we rename the java_proto_library rules? I'd like to reserve the _java_grpc suffix to be generated by java_grpc_library(). Thank you!

Example:

java_proto_library(
-    name = "remote_asset_java_grpc",
+    name = "remote_asset_java_proto",
     deps = [":remote_asset_proto"],
)

jasonschroeder-sfdc avatar Mar 04 '24 20:03 jasonschroeder-sfdc

I love this PR, thank you for the bzlmod support! One request before this is merged.. Can we rename the java_proto_library rules? I'd like to reserve the _java_grpc suffix to be generated by java_grpc_library(). Thank you!

Example:

java_proto_library(
-    name = "remote_asset_java_grpc",
+    name = "remote_asset_java_proto",
     deps = [":remote_asset_proto"],
)

+1

bergsieker avatar Mar 08 '24 19:03 bergsieker

I don't really have any idea how to test this, so we may find that it's necessary to roll back if it turns out to break someone.

Can you expound on what we would consider "breaking"? Between the bazel client version update, bzlmod, and possibly changing target names, I do expect users to need to adapt when updating to latest; though, I also don't expect orgs with business-critical builds to be pulling tip of main in production builds.

minor-fixes avatar Mar 10 '24 21:03 minor-fixes

@bergsieker @EdSchouten Would you mind reviewing and merging? Thank you in advance!!

jasonschroeder-sfdc avatar Mar 15 '24 16:03 jasonschroeder-sfdc

@bergsieker @EdSchouten Would you mind reviewing and merging? Thank you in advance!!

I don't have permissions to merge, so I guess @bergsieker will need to do that. I just looked through the diff, and I can't see anything out of the ordinary. That said, I still haven't found any time to play around with bzlmod. So not seeing anything wrong doesn't mean it's right... 😆 So maybe just ship it?

EdSchouten avatar Mar 16 '24 19:03 EdSchouten

I stacked a few commits on this over here https://github.com/bazelbuild/remote-apis/pull/293

keith avatar Mar 27 '24 21:03 keith

I've just come across this PR (disclaimer: I work on Bzlmod). Thank you for your contribution! And sorry about the long delay -- I'm dedicating time to fixing the googleapis + grpc + remoteapis situation for Bzlmod in Q3, so my hope is that we can land some concrete improvements within the next month.

My main question regarding this PR is the choice to do this:

rules are inlined into the BUILD file next to each proto, instead of residing in a subdir

Putting all language rules in the same BUILD file presents the challenge that we'll need to load all those rulesets when we need just one of them. Having each language be in a subdir at least makes it so that a user project only needs to load the rulesets they need. ("load" here meaning to evaluate those .bzl files, creating targets for the binding targets, etc.)

But there's a deeper problem -- the fact that this module depends on rules_{cc,java,go} at all already means that we're pulling down more than we need. ("pull down" here meaning to actually fetch the ruleset modules -- which happens because of toolchain registration.) Ideally, this module contains only protos, and users of language-specific bindings can somehow only use the languages they need.

I've only just started looking into this area, so I don't have a concrete suggestion as to how that should happen. A naive idea is to have a sub-Bazel-module for each language binding -- so remoteapis as the base module that contains only protos, and then remoteapis-java for Java bindings, remoteapis-go for Go bindings, etc. That's clean but may be too much maintenance burden. Another setup I'm considering (assuming that the binding generation is just calling a binary) is to generate bindings on the fly using module extensions and repo rules.

I'm talking to the googleapis team (https://github.com/googleapis/googleapis) whose project setup has largely the same problem (and googleapis is in fact a dependency of remote-apis). We have a meeting next Monday, so I'll report back after that and coordinate some sort of effort to fix these together.

Wyverald avatar Jul 12 '24 22:07 Wyverald

Thanks, I'm excited to see it land! While working on this, I ran into some fundamental proto+bazel ecosystem questions that maybe you can help clarify:

  • Should binding generation always happen in the module with the protos, or always happen in a dependent module? (I thought I ran into an assertion in generating protobuf bindings from a separate module at one point)
    • If always done from the same module with protos - how do we avoid maintainers of proto modules having to add targets for languages that they might not care about (but downstream users do)?
    • If always done from a different module - how are ODR-type issues resolved, if two different targets exist in the dependency graph that generate bindings from the same proto?
  • Is a single repo with multiple modules a valid option as well?

minor-fixes avatar Jul 16 '24 15:07 minor-fixes

If there are no extra arguments needed, an aspect on top of the proto_library should be able to define the generating rules for each language. The problem comes when certain annotations are needed, where should they be defined?

moroten avatar Jul 16 '24 20:07 moroten

Should binding generation always happen in the module with the protos, or always happen in a dependent module?

In my mind, the ideal state would be that the binding generation happens in a separate module, but one that's shared by everyone using that language. So remote-apis has only protos, and remote-apis-java has Java bindings that all Java people use, remote-apis-py has Python bindings, etc. This avoids ODR violations.

Is a single repo with multiple modules a valid option as well?

Yes -- we can have subdirectories with their own MODULE.bazel files. Then we either build custom release archives, or have everyone use the same archive (containing the whole repo) and rely on strip_prefix.

If there are no extra arguments needed, an aspect on top of the proto_library should be able to define the generating rules for each language. The problem comes when certain annotations are needed, where should they be defined?

Could you give an example of what annotations you're referring to?

Wyverald avatar Jul 16 '24 20:07 Wyverald

If there are no extra arguments needed, an aspect on top of the proto_library should be able to define the generating rules for each language. The problem comes when certain annotations are needed, where should they be defined?

Could you give an example of what annotations you're referring to?

I was thinking about for example importpath in go_proto_library. That is solved with individual modules for each language as you suggest.

moroten avatar Jul 16 '24 20:07 moroten

I can't believe we didn't merge this or any of the other "bump version of dependency X" PRs for over half a year, so now I just wasted 2 days on making my own version of this, because I didn't see this great PR deep down in the open pull request list. 😿

Here's my take on it: https://github.com/bazelbuild/remote-apis/pull/307

It's simpler in various ways and for what I know follows all current best practices / "agreed upon workarounds" for things like the go-genproto / googleapis / cloud.google.com migration situation, and it doesn't require any patches of external repos. A lot of this is probably just due to Bazel 7.x and the bzlmod ecosystem maturing more since then.

This PR adds tests / examples, which is really nice, though.

I don't mind which one gets merged, as long as we make progress on getting the ecosystem fixed by migrating more things to bzlmod and cleaning up technical debt.

philwo avatar Aug 13 '24 13:08 philwo