rules_python icon indicating copy to clipboard operation
rules_python copied to clipboard

gazelle: Visibilities are incorrect for projects with multiple `# gazelle:python_root` directives

Open aryamccarthy opened this issue 1 year ago β€’ 15 comments

🐞 bug report

Affected Rule

The issue is caused by the rule: https://github.com/bazelbuild/rules_python/blob/b106f91c9da7e31d73a7293e88ac78eedcad2057/gazelle/python/generate.go#L215

Is this a regression?

No

Description

This is necessary in monorepo situations where there are multiple # gazelle:python_root directives used, e.g. in the structure below:

ROOT
    proj_1
        src  # is a python root and depends on proj_2
            foo
    proj_2
        src  # is a python root
            bar  ← The visibility here will be incorrect; proj_1 cannot see this.

As it stands, we write a narrowly scoped visibility attribute for every generated rule; the visibility is //{python-root-package}:__subpackages__. Bazel will fail to build targets in proj_1 because of this.

πŸ”₯ Exception or Error





🌍 Your Environment

Operating System:

  
macOS 14.2
  

Output of bazel version:

  
Build label: 7.0.0
Build target: @@//src/main/java/com/google/devtools/build/lib/bazel:BazelServer
Build time: Mon Dec 11 16:52:42 2023 (1702313562)
Build timestamp: 1702313562
Build timestamp as int: 1702313562
  

Rules_python version:

  
0.27.1
  

Anything else relevant?

The desired implementation is not to set everything to //visibility:public, rather, it's to resolve cross-project dependencies in a monorepo the same way as they are within one directory.

Consider a working example here: https://github.com/bazelbuild/rules_python/compare/main...aryamccarthy:rules_python:main

aryamccarthy avatar Jan 11 '24 01:01 aryamccarthy

I think this is something that you may want only if the gazelle roots are using the same requirements files, so this ask is not necessarily something that is useful for all scenarios and I am not sure it should be the default.

aignas avatar Jan 11 '24 04:01 aignas

@aignas I posted a reply here: https://github.com/bazelbuild/rules_python/pull/1683#discussion_r1449064691. Regarding the requirements files, that's true regarding external dependencies. Our main issue has to do with internal dependencies between the different Python projects in our Bazel workspace themselves.

eaplatanios avatar Jan 11 '24 16:01 eaplatanios

Continuing the discussion her.

I think the best way would be to support the default visibility directive that gazelle has to support your usecase.

Just to ensure that we are not missing anything here - if you manually modify the visibility of a talget you want to expose, does it get overwritten in the consecutive gazelle runs?

aignas avatar Jan 11 '24 23:01 aignas

I agree that supporting the # gazelle:default_visibility directive would be ideal. (I mentioned that in #1681.) Until that's implemented (and I would have to learn Go to do it), this gets monorepos with shared requirements files working. And for repos that want the existing behavior, it's a one-line addition to their BUILD file (which @eaplatanios noted):

package(default_visibility = ["//" + package_name() + ":__subpackages__"])

If you manually modify the visibility, it's not overwritten. That's true (and good!). But this saves a lot of manual effort the first time the monorepo gets BUILD files generated by Gazelle, in making those updates. The problem is that the Gazelle-inserted visibility can be too restrictive, but it overrules any package-level visibility declarations. Every single target would need to have the visibility altered after Gazelle is run. And any targets that are added later also need to be corrected.

The calculus that motivated this was: a one-line default_visibility attribute once is much cheaper than N visibility updates and having to perennially make fixes as the codebase grows.

aryamccarthy avatar Jan 12 '24 00:01 aryamccarthy

I think the best way forward is to support the gazelle:default_visibility directive. I don't have the bandwidth to do this right now. @aignas, do you? Otherwise, @aryamccarthy, Go is really easy to pick.

f0rmiga avatar Jan 13 '24 19:01 f0rmiga

I don't have bandwidth to work on this either.

aignas avatar Jan 14 '24 10:01 aignas

It seems the current behavior is erroneous, and either path would ameliorate this. The one in PR #1683 fixes the bug, and processing # gazelle:default_visibility directives could be seen as an additional feature later on.

The Gazelle source code (in a comment) says that if there's a package(default_visibility=...) stanza, then no visibility declaration should be generated. This helps build the case that the current behavior is wrong. Even if we don't "fix" it by respecting default_visibility directives, our PR resolves that.

aryamccarthy avatar Jan 16 '24 16:01 aryamccarthy

This code was added 6 years ago: https://github.com/bazelbuild/bazel-gazelle/commit/cdc77d08956c8f4b9a0509878cda3b9ce9e02095. This is when only the Go generation existed. This is not a rule. If it were, it would be enforced by the framework.

f0rmiga avatar Jan 16 '24 16:01 f0rmiga

@f0rmiga @aignas we're currently blocked from using gazelle python in our codebase due to this issue. What is your suggestion for moving forward assuming we do not have the bandwidth to learn Go and your codebase to make the ideal changes? Is there any chance you'd approve the current PR or a variation of it or should we look into other solutions that do not involve gazelle-python?

And to be clear, the main difference between the # gazelle:default_visibility solution and what the current PR enables afaiu is that you have to type this line in your BUILD file:

# gazelle:default_visibility ...

instead of this one (which the current PR enables):

package(default_visibility = ["//" + package_name() + ":__subpackages__"])

Could you please explain why this is a bad solution? Or at least why you think it's worse than what gazelle-python currently does which is enforcing a stronger opinion that is not configurable at all for the rule visibilities?

eaplatanios avatar Jan 16 '24 19:01 eaplatanios

I appreciate the time you've put into this discussion and raising a point that makes your adoption of the Python extension for Gazelle difficult.

As an open-source project, we have to be careful with changes to the behaviour of this extension, especially one that affects everyone else, in order to satisfy your use case. I'm not saying that your use case is wrong or that it doesn't deserve attention, but the optimal solution has been laid out, and you are not committed to helping improve the code generator (even though you are blocked on it). A few companies that provide Bazel consulting would be happy to contribute to this issue if you hire them.

What is your suggestion for moving forward assuming we do not have the bandwidth to learn Go and your codebase to make the ideal changes?

Find someone who has the bandwidth to contribute (whatever it means - including learning Go). There's also the possibility of writing targets by hand, which lots of people do. Seems to be working fine for them.

Could you please explain why this is a bad solution? Or at least why you think it's worse than what gazelle-python currently does which is enforcing a stronger opinion that is not configurable at all for the rule visibilities?

You are asking for everyone in the community to fit your workflow and not the other way around. The Gazelle code generator is optional and it is opinionated.

f0rmiga avatar Jan 17 '24 00:01 f0rmiga

It also occurred to me that you could probably just use https://github.com/aspect-build/plugin-fix-visibility to fix your visibility issues.

f0rmiga avatar Jan 17 '24 00:01 f0rmiga

One more reason why it may be a sub-optimal solution - sharing dependencies between different workspaces than have different requirements files can lead to unpredictable behaviour when two versions of the same package are in the dependency graph. The default visibility settings are there to enforce a sane and deterministic behaviour.

If your use case does not involve mixing of third party dependencies then it would be great if you could describe it here.

aignas avatar Jan 17 '24 13:01 aignas

@f0rmiga thanks we'll try to use that plugin.

@aignas thanks for that reasoning! This makes sense and it's not a scenario I had in mind. Though I'd like to point that with the changes in the PR the default visibility will be //visibility:private unless explicitly overriden using package(default_visibility = ...), and so the default behavior will still be sane and deterministic iiuc.

Regarding our use case, we have a large monorepo with a single set of external Python/Rust/etc. dependencies for the whole repo. Then, each project in that repo lives in a different directory and only pulls in the dependencies it needs. We perform dependency resolution globally for the whole repo and so do not run into the situation of having different projects depend on different versions of the same package.

eaplatanios avatar Jan 17 '24 14:01 eaplatanios

@eaplatanios, one problem with the //visibility:private by default is that then users would need to manually annotate all of the new BUILD.bazel files they create with visibility settings which is a regression from what we have today. Currently bazel run //:gazelle yields a working BUILD.bazel file for those cases, but if we merged the PR, it would not (for new projects, because gazelle would still honour the existing visibility settings created by previous versions of the plugin).

Regarding the usecase, as I understand correctly your setup is as follows:

  1. You have a single gazelle manifest file generated from a single set of requirements.
  2. You have import paths to start not at the root of the repository (e.g. your monorepo has paths like src1/myproject/foo and src2/myproject2/bar and you want to have the import paths foo and bar for using them in your Python code)
  3. You use the # gazelle:python_root in src1/myproject/BUILD.bazel and src2/myproject2/bar is used to indicate the project roots.

I think this is a real feature request that the gazelle plugin does not currently support. For supporting this feature we could use the default_visibility directive. I am not sure if this means that we have to extend the approach in #1683 (i.e. rules_python plugin should not set the visibility and the default visibility extension from upstream gazelle just works) or if there needs to be a different approach.

aignas avatar Jan 18 '24 00:01 aignas

@aignas what you describe as our use case is accurate. Thanks for summarizing it!

I only have a couple comments on the rest of your message.

You mentioned that:

one problem with the //visibility:private by default is that then users would need to manually annotate all of the new BUILD.bazel files they create with visibility settings which is a regression from what we have today.

I believe this is not entirely correct and you only need to add package(default_visibility = ["//" + package_name() + ":__subpackages__"]) at the top-level BUILD.bazel file in your repo (@aryamccarthy can correct me if I'm wrong here but I believe we tried something like that and it seemed to work). If that is indeed the case, then it is similar in terms of complexity for the user to the default_visibility directive. Though I'm not 100% sure I'm correct about package(default_visibility = ...) and its scope.

I am not sure if this means that we have to extend the approach in https://github.com/bazelbuild/rules_python/pull/1683 (i.e. rules_python plugin should not set the visibility and the default visibility extension from upstream gazelle just works) or if there needs to be a different approach.

Yeah I'm also not sure without trying but unfortunately we don't have the necessary background to try this out (i.e., neither familiarity with Go nor with Gazelle implementation details like how to use that extension from upstream Gazelle here). :(

eaplatanios avatar Jan 18 '24 04:01 eaplatanios

Related: adding a new python_visibility directive, which would work similar to the go_visibility directive, and would allow the user to add a list of labels that should be appended to the visibility list.

Example usage:

# gazelle:default_visibility //src:__subpackages__
# gazelle:python_visibility //tests:__subpackages__
# gazelle:python_visibility //foo/bar:all

Would result in this visibility def:

visibility = [
    "//src:__subpackages__",
    "//tests:__subpackages__",
    "//foo/bar:all",
]

dougthor42 avatar Feb 27 '24 06:02 dougthor42