repository_macro generated in external repository causes gazelle to have slow performance
rules_go = 0.22.1
gazelle = 0.20.0
bazel = 2.0.0
Does this issue reproduce with the latest releases of all the above?
yes
What operating system and processor architecture are you using?
latest osx, 3.3 GHz Dual-Core Intel Core i7
What did you do?
Generated go_repository rules from go.mod in an external WORKSPACE so that I don't have to run update-repos and commit a large unorganized .bzl file to source. This caused a performance degradation in gazelle running bazel run gazelle as it could not locate the repository_marco to parse the existing imports.
The complete description with proposed solution is in #748
What did you expect to see?
It to be as fast as if the bzl repository macro was in my local workspace.
What did you see instead?
It was very slow.
So the issue, as I understand, is that Gazelle needs to be configured to use the same set of external repositories that Bazel is using. This is needed in both the main workspace and in go_repository rules. Without that configuration, Gazelle goes out to the network to try and resolve each external import to a module, which is slow.
Normally, when Gazelle starts in the main workspace, it reads the WORKSPACE file and looks for go_repository rules, # gazelle:repository directives, and # gazelle:repository_macro directives that point to other .bzl files. The set of repository rules it finds are "known", so it won't go out to the network to resolve imports that have one of those repositories as a prefix.
go_repository does something similar. gazelle_dependencies declares a workspace named bazel_gazelle_go_repository_config that generates a configuration file with that information. By default, that file is derived from WORKSPACE, but the go_repository_default_config parameter may be set to a label of another WORKSPACE-like file. That was added so that unrelated changes in WORKSPACE won't invalidate go_repository rules.
There's no way to set a configuration file for the gazelle rule in the main workspace though. #748 partially adds that, but it doesn't generate it the same way bazel_gazelle_go_repository_config does. They need to be consistent.
It may also make sense to be able to generate a configuration file from a .bzl file, not only a WORKSPACE file. That way, an external workspace could provide a dependency macro, and the same file could be used with Gazelle.
Would setting the configuration file/method by having # gazelle:repository_macro take a label work? This looks half implemented in the gazelle_dependencies method. Something like:
# gazelle:repository_macro @go_modules//repositories.bzl%go_repositories
This might be difficult because it will still not be a dependency on the BUILD rule, which means it might not be available to its run files when scanning.
This might be difficult because it will still not be a dependency on the BUILD rule, which means it might not be available to its run files when scanning.
That's the key issue. All of those files have to be available when Gazelle runs. Adding them to the data attribute might work, but there would need to be a good error message when they're not there.
I thought about adding them to the data attribute, but then you would still need to tell it where to look for that file.
e.g.
gazelle(
name = "gazelle",
extra_args = ["--repo-config=external/go_modules/repositories.bzl"],
data = ["@go_modules//:repositories.bzl"]
)
But this seems error prone given we would need to map it to the runfiles path.
It should already know where to look for the files from # gazelle:repository_macro directives. That should be enough to find the runfile.
I don't think it does. This was the reason for the changes in the bash script in the #748 PR.
Could you give me an example of what you think should work and I will go and try it out and report back.
Given a directive like:
# gazelle:repository_macro @go_modules//repositories.bzl%go_repositories
Gazelle should be able to find that file, if it exists. Currently, any label outside the main workspace is rejected so that would need to change.
Gazelle will not be able to ensure this file exists and is up to date on its own though. That could be done with either go_repository_config (when Gazelle is invoked through go_repository) or the gazelle rule (when Gazelle is invoked in the main workspace). go_repository_config could do this without any change to its interface, since repository rules can resolve labels dynamically. The gazelle rule would need files to be specified with the data attribute. But no additional flags should be needed to Gazelle itself, other than perhaps the existing -repo_config flag, which would keep its current meaning.
@jayconrod the above repository_macro directive did not work for me. I get
Error in fail: generate_repo_config: generate_repo_config: failed to load /home/path/to/code/@go_modules/repositories.bzl in repoRoot /home/path/to/code: open /home/path/to/code/@go_modules/repositories.bzl: no such file or directory
I have been able to hack around the problem by using build_config, but actual labels in repository_macro would be highly appreciated.
Hello @chrismgrayftsinc, I'm trying to do the same thing (like # gazelle:repository_macro @go_modules//repositories.bzl%go_repositories), how do you have solved this with build_config?
Thanks!
Since I wrote the comment I've forgotten how the hack worked. :)
The way it looks is I have two files. The first called foo_deps.bzl that has a macro
load("@bazel_gazelle//:deps.bzl", "go_repository")
def foo_deps():
go_repository(
name = "com_github_containerd_containerd",
build_file_proto_mode = "disable_global",
importpath = "github.com/containerd/containerd",
sum = "h1:rQyoYtj4KddB3bxG6SAqd4+08gePNyJjRqvOIfV3rkM=",
version = "v1.5.7",
build_config = "@repository//:foo_dependencies.bzl",
)
...
Then foo_dependencies.bzl has repository directives and go_repositorys defined again.
# gazelle:repository go_repository name=org_golang_x_text importpath=golang.org/x/text
# gazelle:repository go_repository name=org_golang_x_tools importpath=golang.org/x/tools
# gazelle:repository go_repository name=org_golang_x_xerrors importpath=golang.org/x/xerrors
...
go_repository(
name = "com_github_containerd_containerd",
build_file_proto_mode = "disable_global",
importpath = "github.com/containerd/containerd",
sum = "h1:rQyoYtj4KddB3bxG6SAqd4+08gePNyJjRqvOIfV3rkM=",
version = "v1.5.7",
)
...
Then @repository//:foo_deps.bzl can be used in other repositories.
I hope that makes sense. @norbjd
Ooh I see, that's clever, thanks! :eyes:
Do you remember if you were able to generate those files automatically with Gazelle? foo_deps.bzl is similar to the one generated by default with gazelle update-repos (except the build_config), but what about foo_dependencies.bzl?
Obviously we can copy every go_repository manually in that file but it looks like something that would quickly become outdated in the future (like everything manually managed :sweat_smile:).
Thanks again!
I wasn't able to generate the files completely automatically. The amount of editing is fairly low using something like multi-cursor mode in emacs.
I completely agree that they will become outdated. Having actual labels in repository_macro would help.