bazel-gazelle icon indicating copy to clipboard operation
bazel-gazelle copied to clipboard

module graph pruning in 1.17

Open derekperkins opened this issue 4 years ago • 2 comments

Will the upcoming release of 1.17 and the changes it makes to go.mod/go.sum have any impact on gazelle? https://tip.golang.org/ref/mod#graph-pruning

derekperkins avatar Aug 10 '21 03:08 derekperkins

https://sourcegraph.com/github.com/bazelbuild/bazel-gazelle@d7082fb305422c27d9a33530e6e0f6be463ba178/-/blob/language/go/modules.go?L152:6

It seems that gazelle is depending on the shell call to go list [flags] all

Modules whose requirements have been pruned out still appear in the module graph and are still reported by go list -m all: their selected versions are known and well-defined, and packages can be loaded from those modules (for example, as transitive dependencies of tests loaded from other modules). However, since the go command cannot easily identify which dependencies of these modules are satisfied, the arguments to go build and go test cannot include packages from modules whose requirements have been pruned out. go get promotes the module containing each named package to an explicit dependency, allowing go build or go test to be invoked on that package.

So go list might need to provide a flag to filter out pruned modules for this to work 🤔 ?

sluongng avatar Aug 17 '21 09:08 sluongng

This would be amazing to have. Currently Gazelle adds all the modules, including the pruned ones. Then they are all downloaded during the build even if not needed. Is that correct or does bazel not download the unused ones? I'm not sure how repository rules work in this case.

ash2k avatar May 07 '22 02:05 ash2k

So go list might need to provide a flag to filter out pruned modules for this to work thinking :thinking:?

After some digging, I now believe that the information is there in the go list output: pruned modules don't have a GoMod attribute.

As a demo, the following snippet uses jq to select those entries without a GoMod attribute and this lists the modules that appear to be pruned (no entry in the go.sum file).

 go list  -mod=readonly -e -m -json all | jq -rs 'map(select(.GoMod|not).Path)|sort|.[]'

Here is the logic that fills this GoMod field: https://github.com/golang/go/blob/bd8ec78b08ead1fb34ec8dc7bc4bf2ff7a9e8b82/src/cmd/go/internal/modload/build.go#L345-L352 It checks whether there is a corresponding go.sum entry. As far as I understand now, this is missing (only) for pruned modules.

matzf avatar Feb 23 '23 10:02 matzf

I forgot that I implemented this concept in a hacky, non-bazel way a while ago in https://github.com/sluongng/nogo-analyzer/blob/master/tools/prune-deps.sh. So feel free to take a look.

Overall, there is not a lot of benefit in doing these "pruning" for go_repository in a monorepo: go_repository rules are lazy by default and will only get executed/downloaded if it's required for a build. So even if gazelle update-repos were to generate some extra go_repository, those will not be executed most of the time.

It matters, however, when you try to create some starlark rules library for downstream consumption. Downstream repos will have to load your dependencies and may / may not accidentally overwrite their own with yours. This problem class is being solved with BzlMod and the module extension in Gazelle: https://github.com/bazelbuild/bazel-gazelle/blob/master/internal/bzlmod/go_deps.bzl where go.mod and go.sum are parsed directly in starlark. I believe there are still some work left to iron out (i.e. supporting go.work, allowing overrides) but update-repos is effectively going away.

So overall, I think we could safely close this as "wont-fix" unless others have a strong objection.

sluongng avatar Feb 23 '23 10:02 sluongng

Overall, there is not a lot of benefit in doing these "pruning" for go_repository in a monorepo: go_repository rules are lazy by default and will only get executed/downloaded if it's required for a build. So even if gazelle update-repos were to generate some extra go_repository, those will not be executed most of the time.

My very unscientific experiments with removing the pruned go_repository rules seem to confirm this. Despite removing almost 200 pruned dependencies, I did not observe any notable difference when running our bazel builds or queries.

To me, the main argument in favor of fixing this would be to reduce confusion/surprise by this mismatch between go and gazelle. It also seems very easy to fix, by just ignoring modules with empty GoMod when processing the go list output:

filter-pruned-modules.patch
diff --git a/language/go/utils.go b/language/go/utils.go
index 1b2b8cf..b64c7b2 100644
--- a/language/go/utils.go
+++ b/language/go/utils.go
@@ -50,9 +50,9 @@ var goModDownload = func(dir string, args []string) ([]byte, error) {
 // modulesFromList is an abstraction to preserve the output of `go list`.
 // The output schema is documented at https://go.dev/ref/mod#go-list-m
 type moduleFromList struct {
-       Path, Version, Sum, Error string
-       Main                      bool
-       Replace                   *struct {
+       Path, Version, GoMod, Sum, Error string
+       Main                             bool
+       Replace                          *struct {
                Path, Version string
        }
 }
@@ -86,6 +86,9 @@ func extractModules(data []byte) (map[string]*moduleFromList, error) {
                if mod.Main {
                        continue
                }
+               if mod.GoMod == "" {
+                       continue
+               }
                if mod.Replace != nil {
                        if filepath.IsAbs(mod.Replace.Path) || build.IsLocalImport(mod.Replace.Path) {
                                log.Printf("go_repository does not support file path replacements for %s -> %s", mod.Path,

Unsolicited advice for your prune-deps.sh script; use comm instead of diff :wink:

matzf avatar Feb 28 '23 08:02 matzf