rules_codeowners
rules_codeowners copied to clipboard
Missing features to port Angular monorepo
I'd love to use this for Angular:
https://github.com/angular/angular/blob/master/.github/CODEOWNERS which you can see DESPERATELY needs some help to scale
Here's a start: https://github.com/angular/angular/compare/master...alexeagle:codeowners we can only make the switch when the new test passes, currently the delta is https://gist.github.com/alexeagle/d6bf98f1c36923a77caa220edde00063
What we learn from this is:
- packages can have multiple rules. Simple motivating example: maybe I want my Bazel expert to review BUILD.bazel file but anyone can review other files
- Angular uses
@angular/framework-global-approversas a "global" approver and that should be expressible in the root of the tree, not repeated in each package
Hey, that sounds epic! Let's see what we'll have to do here to fix this!
- packages can have multiple rules. Simple motivating example: maybe I want my Bazel expert to review BUILD.bazel file but anyone can review other files
I don't know if I understood you correctly, but I think that this is already possible to achieve. There is no limit on how many codeowners targets that you can have in a single package, as long as you set different patterns.
Does the code in the following example do what you're looking for? https://github.com/zegl/rules_codeowners/blob/56e551f5ae6d37975889291ab26e4aa2494468db/tests/BUILD#L4-L28
- Angular uses @angular/framework-global-approvers as a "global" approver and that should be expressible in the root of the tree, not repeated in each package
In the Angular CODEOWNERS, it seems like the global approvers are almost global, some paths like /tools and a few security related packages are noticeably only approveable by other teams.
What do you think of an API where it's possible to set a global_approvers attribute on the generate_codeowners target, that unless otherwise specified is added to all rows.
To opt-out from the global team having access to a package/path, the codeowners target could have a attribute called include_global_approvers, that when explicitly set to False would not add the global team to the output.
Do you think that an API like that would fit your needs?
Thanks for replying!
I'm making some more progress on it.
I figured that I could add more codeowners() targets, one per pattern. It's workable but the problem is the generator target is going to have hundreds of lines in the owners list. There isn't any grouping, or a way to have a package propagate the codeowners from its subpackages in a way that preserves the encapulation.
So someone trying to update owners for their own subdir still has to nag a global approver to let them change the generate_codeowners target to add their line, and that subverts the point of distributed ownership in a monorepo.
I think we need some kind of forwarding target so there's a tree, instead of the current flat two levels of rules. Does that make sense? I can prototype something to fix it.
For global approval I've added a macro we'll call through. So a typical package has only
load("//tools:defaults.bzl", "codeowners")
codeowners(team = "@angular/fw-core")
We already load //tools:defaults.bzl in most BUILD files so this is a very small delta to land it this way.
The macro itself gets a bunch of logic to reproduce the current CODEOWNERS file. See it here: https://github.com/alexeagle/angular/blob/codeowners/tools/defaults.bzl#L328
I think a wrapper macro to set defaults for rules is a fine pattern, would suggest it to anyone who wants that feature, so I don't think we need to add anything.
I think we need some kind of forwarding target so there's a tree
Aha, I see what you're getting at. I'll hack something together and let you test it out. One generate_codeowners-rule can actually use another one as it's input, but it's duplicating some headers which is annoying.
For global approval I've added a macro we'll call through.
The macro solution is much better than what I had in mind! Once your changes hits angular/angular we can refer to that solution from the documentation in this project.
I did implement my idea anyways, it's currently available for testing in #14, but as the macro setup is more flexible, I'll likely not merge it.
nice!
Take a look at all the comments in https://github.com/alexeagle/angular/blob/codeowners/.github/BUILD.bazel - that's what's left to make the same codeowners as Angular has now.
I think this nesting feature is nicely minimal and lets a package encapsulate its own list of subpackages so that's nice.
Next, let's say a package needs 10 different owners patterns. Your scheme would require that one package BUILD file to have
codeowners(
name = "owners.first",
pattern = "first",
team = "someteam",
)
codeowners(
...
)
...
generate_codeowners(
owners = [
"owners.first",
"owners.second",
...
],
)
which is super-verbose. In particular I'm forced to choose a name for each rule and then put the name in the list, but the name is meaningless. We really want that one package to have
codeowners(
patterns = {
"first": "someteam",
...
},
)
I can probably implement that in the macro too but I think it belongs in the rule.
Final issue is that Angular's code is badly organized, all these aio/... patterns are because documentation for a feature doesn't live in the same package with the feature itself. I could imagine a scheme where a codeowners block can reach outside its package to reference some other files in another package but that's not bazel-idiomatic so I think we should find a local way to fix this.
Actually as I'm rolling it out it seems what I really want is
codeowners(
name = "OWNERS.compiler",
patterns = [
"guide/angular-compiler-options.md",
"guide/aot-compiler.md",
"guide/aot-metadata-errors.md",
],
team = "@angular/fw-compiler",
)
I've got it working in the macro so I'm not blocked.
Sweet!
I've implemented the patterns functionality in #17, the behaviour should be identical to what you've implemented in the macro.
FYI, I still find it too difficult to register all the codeowners() rules with the generation target. What we really want is just add codeowners to a package and the rest is automated.
I got mostly there by scripting it using bazel query and buildozer to automatically discover the codeowners() rules and set them as the owners attribute in generate_codeowners
readonly new_owners=$(
# Query for all codeowners() rules (anchor at the front to avoid match on generate_codeowners rule)
$BAZEL query --output=label 'kind("^codeowners rule", //...)' |
# Print the length of each label at the front of the line
awk '{ print length, $0 }' |
# Sort shortest-first, so that the root //:OWNERS ends up first in CODEOWNERS
sort -n -s |
# 43 label -> "label"
awk '{ print "\x22" $2 "\x22" }' |
# comma-separated
tr '\n' ','
)
readonly command="set owners [$new_owners]|//.github:gen_codeowners"
echo $command | $BUILDOZER -f -
See https://github.com/bazelbuild/rules_nodejs/pull/1266
maybe we should document this in the readme?
Yeah, I feel you.
In one of the repositories where I've introduced rules_codeowners, there's a CI step that verifies that every codeowners is a dependency of the generate_codeowners target in the root (via something like bazel query 'kind(codeowners, //...) except deps(//:generate_codeowners)').
So it's epic that you've taken this a step further and automated the process!
This could probably be implemented as a runable rule, but that's likely overkill. I'll add both your and my variant to the README.
Hey @alexeagle and @zegl, I'm a bit late to the party but thought I would chime in. Love the discussion and implementation of some awesome changes that are definitely moving in the right direction.
Actually as I'm rolling it out it seems what I really want is
codeowners( name = "OWNERS.compiler", patterns = [ "guide/angular-compiler-options.md", "guide/aot-compiler.md", "guide/aot-metadata-errors.md", ], team = "@angular/fw-compiler", )I've got it working in the macro so I'm not blocked.
Your example desired state (and what @zegl implemented) would still require multiple codeowners() rules when multiple teams each own multiple patterns. Wouldn't a two dimensional array be better suited for this? Something like:
codeowners(
rules = [
"@angular/fw-compiler": [
"guide/angular-compiler-options.md",
"guide/aot-compiler.md",
"guide/aot-metadata-errors.md",
],
....
],
)
I think this provides a cleaner interface when considering that one might have end up with a complex team:pattern matrix, even in smaller projects -- for example, separating ownership of graphical assets, documentation, source code, build rules, etc.
FYI, I still find it too difficult to register all the codeowners() rules with the generation target. What we really want is just add codeowners to a package and the rest is automated.
I got mostly there by scripting it using bazel query and buildozer to automatically discover the codeowners() rules and set them as the owners attribute in generate_codeowners
I think we'd be able to accomplish this by rewriting codeowners() to instead invoke native.sh_binary()... I'm still tiptoeing my way around custom rules, but I'll take a stab at it this weekend and see what I can come up with.
I thought about a more complex datastructure but I think the fact that multiple codeowners rules compose makes everything simpler and gives approx. the same ergonomics for expressing what you need.
I guess bazel run .github:codeowners.update would be a nice way to get all the codeowners registered, provided you can get the dependency on buildozer to work.
Okay, so the power outage set me back a bit, but it gave me a chance to think about how I'd like rules_codeowners to work. I'll run through this below, and if you both think it's a good direction to move, I'll work on refactoring it this weekend. The work @alexeagle did over at https://github.com/bazelbuild/rules_nodejs, as well as the discussion in this thread, lended to these thoughts.
Initialization
I think it's important that rules_codeowners allows defining global owners at the project root, separate from a codeowners() rule. Based on the pattern by many other rule sets, this should be done with something like the following:
# WORKSPACE
http_archive(
name = "rules_codeowners"
...
)
load("@rules_codeowners//:foo.bzl", "codeowners_register")
codeowners_register(
global_owners = [
"@teamA",
"@teamB",
]
)
This allows defining global owners in the global scope (e.g. at the root of the workspace), but keeps it out of any codeowners() rules for cleaner organization.
Output Path
Additionally, codeowners_register() should take a filepath parameter that defaults to .github/CODEOWNERS (or another one of the valid GitHub CODEOWNERS paths). This will be consumed at build time.
Building
In an ideal world, we'd be able to bazel build ..., and the codeowners() rules would be picked up and compiled into a file that's output to filepath. Unfortunately, we can't do that. What we can do is provide a target that is runnable and does the same work; we'd only need to provide the name -- this would be similar to other rules such as bazelbuild/bazel-gazelle and bazelbuild/buildtools, and would be invoked as such:
$ cat BUILD.bazel
load("@rules_codeowners//:foo.bzl", "generate_codeowners")
generate_codeowners(name = "codeowners")
$ bazel run //:codeowners
This would parse all of the codeowners() rules and create the generated file at the specified output path (from codeowners_register(), above).
I think following the pattern of established tool integrations like bazel-gazelle and buildifier (via buildtools) is a great way forward.
Rule Semantics
I'm not quite at the point where users are owners of indivdual files, so most of the time, I find myself simply using this in my package BUILD files:
codeowners(team = "@foobar")
I'm currently using the same macro as seen HERE to provide some nice defaults
Playing around with it, I think the pattern(s):team(s) combination we have is great, and doesn't need to change. It gives the right flexibliity for any combination of ownership.
What are your thoughts?
@zegl @alexeagle Can I get your thoughts on that :point_up_2:? I'm happy to create issues and do the work, but I'd rather we have some sort of a consensus on what the rule(s) should look like and how to use them.
I agree with you that we should have a solution to all of these issues, and largely I agree with the solution that you have already suggested.
Let's open new issues for all of these points, to make the discussion a bit more organized. 😃
Global owners
I agree that it would be nice to have a built in way to support this. Either through the generate_codeowners rule, or through a repository rule. I think that the formed would both be easier and nicer, as it's one fewer rule in total.
Runnable rules
The generate_codeowners rule should be runnable, and take a new parameter (path?) that controls where generated file should be written.
This might break someones workflow, but I think that it's for the greater good. We can still have the old .out output, to stay almost backwards compatible.
Macro
As for the macro, do you think that it would be nice to host that macro in this repo, to make adoption easier?
As for the macro, do you think that it would be nice to host that macro in this repo, to make adoption easier?
I think the need for the macro will be removed with additional changes to the library; namely, the global_owners property which we have talked about here. I think the no_parent behavior makes a lot of sense, too -- this could be added as an attribute of codeowners().
I'll create issues tonight to further the discussion and isolate the work.
Great, no macro needed then! :)