
Using rules_go with go modules and generated packages

robbertvanginkel opened this issue 4 years ago · 13 comments

First, I'm not sure if I should file this on rules_go, bazel-gazelle or golang. Please let me know if there's a better forum.

The experience of building go code with rules_go/gazelle works pretty well in general. Unfortunately when using both go modules for dependency management and rules for autogenerating go packages (such as go_proto_library or gomock), the experience breaks down a bit.

Consider the following project:

--- BUILD.bazel ---
load("@bazel_gazelle//:def.bzl", "gazelle")

# gazelle:prefix github.com/example/project
gazelle(name = "gazelle")

--- cmd/main.go ---
package main // import "github.com/example/project/cmd"

import "fmt"

func main() {
       fmt.Println("Hello!")
}

With a standard workspace file (I tested go1.13.3, rules_go v0.20.1, gazelle 0.19.0, bazel 1.1.0) it is straightforward to get the program running:

$ bazel run //:gazelle
$ bazel run //cmd
Hello!

Adding a dependency with go mod is also straightforward:

$ git diff cmd/main.go
diff --git a/cmd/main.go b/cmd/main.go
index d938538..55e7774 100644
--- a/cmd/main.go
+++ b/cmd/main.go
@@ -1,7 +1,11 @@
 package main // import "github.com/example/project/cmd"

-import "fmt"
+import (
+       "fmt"
+
+       "github.com/gofrs/uuid"
+)

 func main() {
-       fmt.Println("Hello!")
+       fmt.Printf("Hello %v!\n", uuid.Must(uuid.NewV4()))
 }
$ go get github.com/gofrs/uuid@latest
$ go mod tidy
$ bazel run //:gazelle -- update-repos -from_file=go.mod
$ bazel run //:gazelle
$ bazel run //cmd
Hello 03ec1161-9f7d-43f2-80d3-95522e517d7a!

When starting to consume some generated code, at first all seems fine:

diff --git a/cmd/main.go b/cmd/main.go
index 55e7774..4d3fae0 100644
--- a/cmd/main.go
+++ b/cmd/main.go
@@ -3,9 +3,11 @@ package main // import "github.com/example/project/cmd"
 import (
        "fmt"

+       "github.com/example/project/proto"
        "github.com/gofrs/uuid"
 )

 func main() {
        fmt.Printf("Hello %v!\n", uuid.Must(uuid.NewV4()))
+       fmt.Printf("Hello %v!\n", proto.Polyglot{})
 }

Where github.com/example/project/proto is a generated golang package:

--- proto/BUILD.bazel ---
# stub rule for generated go code, in practice imagine proto/gomock rules here
genrule(
    name = "genproto",
    cmd = "echo 'package proto\n\ntype Polyglot struct{}' > $@",
    outs = ["proto.go"],
)

After generating the rules everything seems to work fine:

$ bazel run //:gazelle
$ bazel run //cmd
Hello 03ec1161-9f7d-43f2-80d3-95522e517d7a!
Hello {}!

But when you later try to add a new dependency to the project, go get still works, while go mod tidy starts throwing errors like:

$ go get golang.org/x/text
$ go mod tidy
github.com/example/project/cmd imports
	github.com/example/project/proto: git ls-remote -q https://github.com/example/project in /Users/robbert/gocode/pkg/mod/cache/vcs/48e4a55da23b18d4dd53d568f6a9a78ee3195ecd4c570168b7f17b2d37a13a26: exit status 128:
	fatal: could not read Username for 'https://github.com': terminal prompts disabled
Confirm the import path was entered correctly.
If this is a private repository, see https://golang.org/doc/faq#git_https for additional information.

So far, we'd come up with the following to get around this:

  • Add dummy.go files with // +build ignore in the directory corresponding to the importpath of each generated package (see the sketch just after this list).
  • Have all generated packages use importpaths starting with gen/. This made go1.12 modules think they were part of the standard library and not worry about them, but go1.13's modules require the first path component to contain a . somewhere.
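
For illustration, the dummy file from the first bullet might look roughly like this (the package name proto is assumed to match the generated package from the example above):

--- proto/dummy.go ---
// +build ignore

// This file only exists so the go tool sees a static .go file at this
// import path; Bazel ignores it and builds the generated sources instead.
package proto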

As far as I've read into the design and code of go modules, they seem pretty tightly coupled to the default go build system. That's unfortunate: we like Bazel for its codegen/caching features, but not being able to combine the standard go modules dependency resolution with generated go packages is a bit sour.

Are there any known workarounds for this? Maybe there is a way rules_go/Bazel could inform go modules about which packages are expected to exist? Or would we need to open an issue with golang to see if the modules functionality can somehow be exposed for use by third-party build systems?

robbertvanginkel avatar Oct 26 '19 01:10 robbertvanginkel

I have been using the dummy.go trick as well.

priyendra avatar Oct 27 '19 11:10 priyendra

So if I can summarize a bit, the issue is that you depend on a package that only contains generated code (github.com/example/project/proto), but go mod tidy and other Go commands report errors when you import that package because it doesn't contain any static .go files.

You're correct that Go modules are very much integrated into the go command. Code generation is very much not integrated into the go command. There is go generate, but that's not part of the regular build, and it only works in the main module. That does make generated code somewhat difficult to handle.

There are a number of workarounds, some of which you've already found.

  • Check in generated files at released versions. They don't necessarily have to be on the main development branch, just make sure they're present in commits referenced by tags like v1.5.2. This is necessary for full compatibility with non-Bazel users.
  • Check in dummy.go files with // +build ignore. This gives you minimal compatibility, as you've found.
  • Edit go.mod directly and avoid commands that load and resolve imports to generated files. This gives you limited dependency management capability without too much extra work.

Beyond that, I'm not sure I have a generally good solution to recommend for you. Kubernetes ran into some of the same issues. Bazel appealed to them because it seemed like they could remove a lot of their generated code from their repo, but non-Bazel users still needed to import their packages, so they were never really able to do that.

I'm open to solutions on the Gazelle side that don't diverge too far from what the go command does. Currently, gazelle update-repos -from_file=go.mod runs go list -m -json all to gather information about modules in the build list, then translates that into go_repository rules. After that, Gazelle has fairly minimal interaction with modules.

It's unlikely that fully general code generation will be integrated into go build. That would mean Go would need to build and execute tools written in other languages. It might need to interact with other dependency management systems. I think we'd end up with a worse version of Bazel if we followed that path.

jayconrod avatar Oct 29 '19 20:10 jayconrod

That pretty much sums it up.

To add some clarification on our situation: the repository we use the go.mod file in is an internal monolithic repository with a collection of service and library code. The goal of using go modules is to manage a single version of the dependencies for all projects in it. There is no intention of making this module importable, so generated code for non-Bazel users isn't a major concern for us.

Manually editing the module file could be a possibility, but manually trimming the module and sum files would be error prone with a large group of developers. I guess what I'm looking for is some way to do maintenance like go mod tidy, but with information about the existing packages and imports coming from Bazel/rules_go rather than go build.

robbertvanginkel avatar Oct 29 '19 23:10 robbertvanginkel

You may want a custom tool for this. At one point, I wanted gazelle update-repos to be able to do this kind of thing, but it seemed like in the general case, there would be scaling and correctness problems with a large number of repos.

There are primitives available you may find useful.

  • The golang.org/x/mod repository has some packages that handle modules. These were originally part of the go command, but we're gradually pulling functionality out into an external module so that these kinds of tools can be written. In particular, golang.org/x/mod/modfile, golang.org/x/mod/module, and golang.org/x/mod/semver will be helpful.
  • For adding or updating dependencies, go get -d should work. It doesn't require targets to be packages. Alternatively, you can just edit go.mod (but be careful to use valid pseudoversions). go mod download -json example.com/mod@latest may help, too.
  • Pruning dependencies is the hard part. bazel query 'deps(//...)' should give you a list of labels of all transitive dependencies (use whatever pattern is appropriate). You could parse out repo names and remove any external repos not listed there, then remove the corresponding modules from go.mod if they are listed as direct dependencies (a modfile-based sketch follows below).
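
As a rough illustration of the pruning step, a sketch using golang.org/x/mod/modfile could look like the following; the keep function (deciding which module paths are still referenced, e.g. derived from the bazel query above) is hypothetical and its implementation is left out:

--- prune.go ---
package main

import (
	"os"

	"golang.org/x/mod/modfile"
)

// prune drops require directives for modules that keep reports as unused.
// How keep is computed (e.g. from bazel query output) is up to the caller.
func prune(keep func(path string) bool) error {
	data, err := os.ReadFile("go.mod")
	if err != nil {
		return err
	}
	f, err := modfile.Parse("go.mod", data, nil)
	if err != nil {
		return err
	}
	var drop []string
	for _, r := range f.Require {
		if !keep(r.Mod.Path) {
			drop = append(drop, r.Mod.Path)
		}
	}
	for _, path := range drop {
		if err := f.DropRequire(path); err != nil {
			return err
		}
	}
	f.Cleanup() // tidy up the syntax tree after removals
	out, err := f.Format()
	if err != nil {
		return err
	}
	return os.WriteFile("go.mod", out, 0o644)
}

func main() {
	// Placeholder policy: keep everything. Replace with a Bazel-derived filter.
	if err := prune(func(string) bool { return true }); err != nil {
		panic(err)
	}
}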

jayconrod avatar Oct 30 '19 14:10 jayconrod

@jayconrod

I would like to hear your feedback on another approach we have thought of for our monorepo, to help go modules work with generated packages.

So like you said, the source of our problem is that go mod tidy reports errors when some code within a module imports a package that doesn't contain any static .go files. Meaning, if we never feed those packages to go mod tidy, then it will not complain.

The idea is that we can filter all of the imports in our repo to only what go mod tidy cares about, and then place the filtered imports into a single .go file, let's call it imports.go. Our filtering function for imports.go should remove all internal imports (including the imports of generated code), and leave us with only external imports. It essentially will look like this:

 (all imports in every go file) - (all importpath attrs known to bazel under //...)
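
For illustration only, the resulting imports.go could look something like this (the shadow-module directory name and the specific external imports are placeholders drawn from earlier in this thread):

--- shadow/imports.go ---
// Package imports only exists so that go mod tidy run in the shadow module
// sees the repo's external dependencies; it is never compiled into a binary.
package imports

import (
	_ "github.com/gofrs/uuid"
	_ "golang.org/x/text/language"
)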

We then can move our go.mod/go.sum outside of the main module root and into a directory containing only imports.go and tools.go. Within this "shadow module" is where commands such as go get -d and go mod tidy will be run. Before go mod tidy is run, we will have to make sure imports.go is up to date.

With this "shadow module" approach, we will be able to use go modules for dependency management, and bazel for building and code generation.

Known limitations:

  1. go build in module mode will not work - this is ok for us, as we use either bazel build or vendored go build in gopath mode
  2. our source module will not be importable from other repos - this is fine because we have a monorepo
  3. go modules will not know about generated code's external dependencies - this is already the case for us today, but if it ever becomes an issue, we could modify our filtering logic to include these imports

Are there any other limitations or gotchas you think we may be missing?

blico avatar Dec 06 '19 02:12 blico

@blico I think that will work.

How are you planning to list all imports? I can think of a couple of different ways. Something built with bazel query 'deps(//...)' might work, printing the importpath attribute for every library. An aspect would work, too, though it's more complicated.

Once you have that list, using an imports.go / tools.go file in a shadow module should work fine.

jayconrod avatar Dec 06 '19 15:12 jayconrod

Our idea right now is to:

  1. Run bazel query --output=proto '//...:*' to find all of the non-external srcs and importpath attributes known to Bazel.
  2. Read all of the imports from the .go srcs gathered in the first query, using go/parser (a sketch follows after this list).
  3. From the imports collected in the second step, filter out any imports that also were found in the first step
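
A minimal sketch of step 2 might look like this; collectImports and the hard-coded file list are hypothetical, and the parsing of the bazel query output from step 1 is omitted:

--- collectimports.go ---
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
)

// collectImports parses each file with ImportsOnly (stopping after the
// import declarations) and returns the set of imported paths.
func collectImports(files []string) (map[string]bool, error) {
	imports := map[string]bool{}
	fset := token.NewFileSet()
	for _, name := range files {
		f, err := parser.ParseFile(fset, name, nil, parser.ImportsOnly)
		if err != nil {
			return nil, err
		}
		for _, spec := range f.Imports {
			path, err := strconv.Unquote(spec.Path.Value)
			if err != nil {
				return nil, err
			}
			imports[path] = true
		}
	}
	return imports, nil
}

func main() {
	// The file list would come from the bazel query in step 1.
	imps, err := collectImports([]string{"cmd/main.go"})
	if err != nil {
		panic(err)
	}
	fmt.Println(imps)
}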

Because we are using Bazel as our source of truth with this approach, an added requirement for users is they will need to run gazelle before running go mod tidy, whereas previously it was possible for them to only run go mod tidy.

blico avatar Dec 09 '19 23:12 blico

@blico, @robbertvanginkel, are you still using the same approach? I'm bumping into this as well.

c4milo avatar Mar 18 '21 05:03 c4milo

Yes, we are still using @blico's approach (I am in the same team as @blico and @robbertvanginkel).

linzhp avatar Mar 18 '21 05:03 linzhp

A solution that I came up with to this problem is the following:

Suppose we have a directory of proto files. Add a file gen.go to it with the following content:

package pb

//go:generate protoc --go_out=module=<dir>:. --go-grpc_out=module=<dir>:. some.proto

Add *.go and BUILD.bazel (because it will depend on the generated code) files to .gitignore for that directory, and add an exclusion rule for gen.go. Depending on the structure of your monorepo you can get away with just a couple of lines.

Disable proto rule generation in gazelle using # gazelle:proto disable in the root BUILD.bazel.

Now to get everything running for a freshly cloned repo do the following:

go generate ./...
go mod tidy
bazel run //:gazelle -- update-repos -from_file=go.mod -to_macro=repositories.bzl%go_repositories
bazel run //:gazelle
bazel build //...

This solution should work for any code you can generate using a //go:generate comment. Another benefit is that all LSP-related things just work; the GOPACKAGESDRIVER is still experimental and I had quite some problems with it.

Note that this pushes dependency management of code generators to the underlying system (your CI or Container this runs in).

Ideally there should be a native solution for this problem but for the moment this is simple and robust.

clstb avatar Jul 31 '21 11:07 clstb

After investigating all of the above solutions, I think using the stackb rules works best for me: https://github.com/stackb/rules_proto

Just use the proto_compiled_sources rule, then use the following script to update protobuf generation (assuming all protobuf files are stored in the proto directory).

	echo "Cleaning up existing generated protobuf files..."
	#find "proto/" -name BUILD.bazel -delete # uncomment if you rely on gazelle to generate rules.
	find "proto/" -name "*.pb.go" -delete
	echo "Generating bazel BUILD rules..."
	bazel run //:gazelle
	echo "Compiling protobuf files..."
	bazel query "kind('proto_compile rule', //proto/...)" | tr '\n' '\0' | xargs -0 -n1 bazel build
	bazel query "kind('proto_compile_gencopy_run rule', //proto/...)" | tr '\n' '\0' | xargs -0 -n1 bazel run
	bazel query "kind('proto_compile_gencopy_test rule', //proto/...)" | tr '\n' '\0' | xargs -0 -n1 bazel test

In my case, neither my upstream repositories nor my downstream repositories use bazel (yeah, I am the only one promoting it due to my past Googler experience), so proto_compiled_sources provides the most compatibility. With that, bazel can live with go mod tidy without problems.

ql-owo-lp avatar Mar 23 '22 21:03 ql-owo-lp

Currently we're just adding empty Go files to generated packages and excluding said files from Gazelle. The empty Go files do not have build constraints (//go:build ignore), as the Go toolchain would otherwise still treat the package as empty. Not great, but not terrible either. It would be really nice for Gazelle to handle this better.

empty.go:

package mock

BUILD.bazel:

# gazelle:exclude **/mock/empty.go

One thought I had: I wonder if it's possible to use the GOPACKAGESDRIVER (https://github.com/bazelbuild/rules_go/issues/512) with gopls to tidy modules?

uhthomas avatar May 26 '22 15:05 uhthomas

@blico Is the tool you use open source?

andruwm avatar Jul 06 '22 19:07 andruwm