cmd/go: add global ignore mechanism for Go tooling ecosystem

Open burdiyan opened this issue 4 years ago • 129 comments

UPDATE: The summary of the accepted proposal is at https://github.com/golang/go/issues/42965#issuecomment-1974089649.


Problem

For non-trivial (often multi-language) projects it's often desirable to make all the Go tools (including gopls) ignore certain directories.

Examples include the huge number of files within node_modules, or the bazel-* directories generated by Bazel. These make many operations with ./... wildcards take longer than desired. gopls also often consumes a lot of CPU in VS Code, depending on what you are doing.

Prior Art

This has been discussed in several issues before, but it seems people couldn't agree on a solution.

  • https://github.com/golang/go/issues/30058
  • https://github.com/golang/go/issues/35914
  • https://github.com/golang/go/issues/42473

Some tools have started to ship their own solutions, which causes fragmentation and is cumbersome.

For example, goimports has its own machinery for this: a .goimportsignore file. But it doesn't work with Go modules.

Other tools have a hard-coded list of directories to ignore, like .git and so on.

It seems like having a global solution that all the Go ecosystem could understand would make sense to solve this kind of problem.

A recent workaround is to place a dummy go.mod file in the directories you want to ignore. But this is not easily portable between users of the project, because often these directories can be re-created on the user's machine and aren't even checked-in. Asking people to sprinkle some go.mod files all around every time is cumbersome.

@robpike was against creating more dot files (https://github.com/golang/go/issues/30058#issuecomment-475003231).

Proposed Solution

Here are some options for how this could be implemented.

  1. ~~Use go.mod file for specifying directories to ignore.~~ (Rejected because go.mod is not a catch-all config file like package.json in NodeJS).
  2. ~~Use a separate .goignore file.~~ (This would go against Rob's desire to avoid new dot files and, although in the spirit of other tools' .dockerignore, .gitignore, .bazelignore, etc., is concerning. The concerns are discussed in this thread.)
  3. Use the go.work file that's coming in the next Go 1.18 release.
  4. Have a separate go.ignore file that would specify directories to ignore.

/cc @tj @stamblerre

burdiyan avatar Dec 03 '20 08:12 burdiyan

But this is not easily portable between users of the project, because often these directories can be re-created on the user's machine and aren't even checked-in. Asking people to sprinkle some go.mod files all around every time is cumbersome.

I'm not sure that I understand this argument. Presumably, it's a program that creates and fills those directories, since they have to contain a significant amount of files for you to really want to ignore them in Go. If they were just a handful of files created manually by a human, it would be a negligible cost for Go to walk those and realise there are no Go packages there.

So, given that it is a program or script creating those large directories, why not add a touch ${dir}/go.mod at the end? That seems easy enough at a high level, at least.

I'm proposing to add this configuration into the existing go.mod file.

This is unlikely to happen, see https://github.com/golang/go/issues/42343#issuecomment-737406453.

Another solution could be a global .goignore file. This would go against Rob's desire to avoid new dot files, but would be in the spirit of other tools that have files like .dockerignore, .gitignore, .bazelignore, etc.

I have to admit that I dislike this option. It's bad enough that all these other tools use separate ignore files.

mvdan avatar Dec 03 '20 09:12 mvdan

I'm not sure that I understand this argument. Presumably, it's a program that creates and fills those directories, since they have to contain a significant amount of files for you to really want to ignore them in Go. If they were just a handful of files created manually by a human, it would be a negligible cost for Go to walk those and realise there are no Go packages there.

So, given that it is a program or script creating those large directories, why not add a touch ${dir}/go.mod at the end? That seems easy enough at a high level, at least.

@mvdan It is indeed a program that creates these directories. But it's a program that you don't control normally. Wrapping well-known tools like npm install with your own script only to put an empty go.mod in there doesn't seem right.

On the other hand by placing arbitrary files in these directories you're invading the territory of other tools. What if that program checks the integrity of the directory and would break seeing a random unknown file? It's not the case with node_modules but breaking into structures created by other programs, only to work around your own problem doesn't seem right either.

I understand the objection about go.mod. I was not aware of @rsc's statement.

I have to admit that I dislike this option. It's bad enough that all these other tools use separate ignore files.

Could you elaborate on why you think it's bad? It may not be the most elegant solution, but it's common practice, well-understood and somewhat expected.

If we already have .goimportsignore, why not standardize it into something that can be handled and understood by the whole ecosystem of Go tools?

burdiyan avatar Dec 03 '20 13:12 burdiyan

It is indeed a program that creates these directories. But it's a program that you don't control normally. Wrapping well-known tools like npm install with your own script only to put an empty go.mod in there doesn't seem right.

Wouldn't you need to wrap the tool to add a .goignore file anyway? (Given that you need to inject a file, why does it matter whether it is named .goignore or go.mod?)

bcmills avatar Dec 03 '20 15:12 bcmills

@bcmills My proposal is to add a file in the root of the project, not in the directory being ignored. So it would be checked in. Like .gitignore in Git. Basically the idea is to list the paths to ignore in that file, and check it in.

burdiyan avatar Dec 03 '20 16:12 burdiyan

...ok? But why would you not also check in the injected go.mod files?

bcmills avatar Dec 03 '20 16:12 bcmills

Could you elaborate on why you think it's bad? It may not be the most elegant solution, but it's common practice, well-understood and somewhat expected.

The common practice is to litter repositories with dot files. That does not mean we should do the same, making the problem worse :) Go already has multiple mechanisms to ignore entire directories (. or _ prefixes, and dropping empty go.mod files), so there needs to be a really good reason to add another method.

mvdan avatar Dec 03 '20 16:12 mvdan

...ok? But why would you not also check in the injected go.mod files?

@bcmills because often directories to ignore aren't checked in.

burdiyan avatar Dec 03 '20 16:12 burdiyan

The common practice is to litter repositories with dot files. That does not mean we should do the same, making the problem worse :) Go already has multiple mechanisms to ignore entire directories (. or _ prefixes, and dropping empty go.mod files), so there needs to be a really good reason to add another method.

@mvdan IMHO, having a dot file in one place, that is trackable, is less of an evil, than sprinkling empty go.mod files all over the place, ad-hoc, and breaking into opinions of other tools.

Thinking about the pros and cons of implementing such a feature, I'm struggling to see any cons (probably due to my ignorance), besides having to spend the time to implement it. I'd appreciate it if anyone could shed some light on this to help me understand the implications.

burdiyan avatar Dec 04 '20 17:12 burdiyan

I have a use case where I have a multi-language repo where not all of the developers are touching the go components.

I don't think it is reasonable to ask my docker, bazel, and nodejs developers to all wrap their normal tooling in scripts that touch extra files in their build directories, nor ask them to try to rename their standard build directories to match existing go conventions, some of which conflict with the other tool conventions.

It seems like there should be a way to specify how to ignore certain files or directories that does not require modifying the content of those files or directories, because the ignored content is not being managed by go and may have its own conflicting conventions and lifecycle.

psigen avatar Mar 19 '21 03:03 psigen

@psigen Go wants a directory tree that belongs to it. In a multi-language repo, why not create a top-level go/ directory?

rsc avatar Aug 18 '21 18:08 rsc

@rsc: because my projects are not organized that way. I have services like:

service1/
    backend/  # golang
    debug-cli/
    proto/
service2/
    backend/  #python
    debug-ui/
    proto/
webapp/
    frontend/ 

I know what you are asking, which is why not reorganize to:

proto/
    service1/
    service2/
golang/
    service1-svc/
    service1-cli/
python/
    service2/
nodejs/
    webapp/
    service1/debug-ui/

And the answer, (besides "that's a lot of work right now") is that it is not how our ownership is structured.

It is not convenient to have duplication in the CODEOWNERS files, .gitignore patterns that look like **/service1/foo*, cross-directory-tree docs links, etc. all in service of golang. It makes PR reviews harder when related changes happen all over the directory tree. It forces docker build contexts to all need to be at the root of the entire source tree, and makes live-rebuilds in tools like Tilt and Skaffold much more difficult to author.

I could go on, but I'm really just reiterating the core premise of this proposal:

For non-trivial (often multi-language) projects it's often desirable to make all the Go tools (including gopls) ignore certain directories.

psigen avatar Aug 21 '21 01:08 psigen

I use the serverless framework to deploy lambdas on AWS. Some plugins that I have to use contain Go files. So when I run go mod download or go mod tidy, dependencies required by the Go files inside the node_modules directory get added to my go.mod file. It would be great to define a way to exclude directories from Go modules.

node_modules
  serverless
    lib
      plugins
        create
          templates
            aws-go
              main.go
cmd
  main.go
pkg
  dirA
  dirB
  dirC

4k1k0 avatar Sep 03 '21 17:09 4k1k0

@rsc

@psigen Go wants a directory tree that belongs to it. In a multi-language repo, why not create a top-level go/ directory?

Reading this statement does not make me happy. It goes against the entire premise of the code organization at my company.

We use a monorepo with projects in several languages. Some Go programs live inside projects written in other languages. Sometimes we rewrite a project from one language to another. Some projects use a combination of Go and other languages (imagine a website written using both Go and JavaScript extensively). We already have an organizational hierarchy within the monorepo that is based around purpose and ownership, not language.

This all worked fine with $GOPATH: the repo was inside its own single $GOPATH segment and a top-level vendor directory contained a single version of all shared dependencies.

Moving to modules has raised some challenges, but mostly it has worked. The whole repo is one module so we use a fixed, shared set of dependencies. One issue we faced is that the go mod tidy and other commands printed out a bunch of irrelevant spam (#35941) -- we sent a fix for that. Other issues we see involve various kinds of slowness in gopls (#46438 describes one particular issue I have). But mostly it works fine, and ISTM that the remaining issues are surmountable if the folks working on the tools care about making them work well in the presence of mixed-language source trees (and until now it seemed to me that they mostly do!).

But when I read "Go wants a directory tree that belongs to it", it sounds like you don't think this use case matters as far as the standard Go tools are concerned. I don't know how we could possibly adapt our repo to a "Go code all belongs in its own tree" model. Probably we wouldn't -- I imagine that if push came to shove, we'd look into alternative build tools.

cespare avatar Sep 03 '21 23:09 cespare

In #50225 I'm raising concerns about the resources (network, disk space) wasted on every developer's machine because the module zips contain many irrelevant files.

Check this list of files that are in your Go modules cache:

find $(go env GOMODCACHE)/*.* -type f ! -name '*.go' ! -name 'go.mod' ! -name 'go.sum' ! -name 'list.lock' ! -name 'v*.mod' ! -name 'v*.info' ! -name 'v*.zip' ! -name 'v*.ziphash' ! -name 'v*.lock' ! -name 'LICENSE*' ! -name 'README*' -print

I have more than 200,000 useless files on my machine.

This also impacts CI builds (download time and space for new dependencies; aggressive module caching has to be enabled to reduce the problem).

dolmen avatar Dec 16 '21 18:12 dolmen

While similar, I think this proposal is a bit different from yours, @dolmen, in the sense that here I mostly care about ignoring directories, not specific files, and definitely not for specific packages like x/mod/zip. Still, the same solution could solve both problems.

burdiyan avatar Dec 17 '21 09:12 burdiyan

BTW, go.work is coming in the next Go release. Maybe this feature could be implemented in there eventually? Or maybe a separate go.ignore file? Looks like a better approach than .goignore for sure!

I updated the initial comment.

burdiyan avatar Dec 17 '21 09:12 burdiyan

BTW, go.work is coming in the next Go release. Maybe this feature could be implemented in there eventually?

I consider go.work as a development tool for your local development environment. Which means it is a file I would not commit in the repo.

Instead, ignore patterns must be available for tools that download the code from a VCS (for publishing on a proxy, or for filling the module cache, see #50225), so the ignore patterns must be always available in the repository.

So go.work would not be a good place for ignore patterns.

dolmen avatar Dec 17 '21 11:12 dolmen

@dolmen While I suspect that go.work is meant to be checked-in (I'm not sure about it), I think you're right that the ignore stuff should probably be in a separate place, because not all projects would want to have go.work. Then maybe go.ignore is the remaining option that would make some people happy, and the rest (those who don't like the idea of dot files) at least not angry about it :)

burdiyan avatar Dec 18 '21 23:12 burdiyan

@burdiyan, go.work is indeed not meant to be checked in:

These go.work files should not be checked into the repositories so that they don’t override the workspaces users explicitly define. Checking in go.work files could also lead to CI/CD systems not testing the actual set of version requirements on a module and that version requirements among the repository’s modules are properly incremented to use changes in the modules. And of course, if a repository contains only a single module, or unrelated modules, there's not much utility to adding a go.work file because each user may have a different directory structure on their computer outside of that repository. — Proposal: Multi-Module Workspaces in cmd/go §Multiple modules in the same repository that depend on each other

antichris avatar Dec 19 '21 13:12 antichris

It is kind of frustrating that the responses from Go contributors are uniformly "Everyone else on earth is wrong, they should change to accommodate our design choices."

No matter how inelegant another dotfile is, it solves the problem in a universal way that will work for all repository structures and build tools. None of the proposed alternatives even attempt to do the same.

I currently just don't run gopls and try to minimize how often I have to write Go, which is not a "solution" that is available to everyone.

jaronsummers avatar Feb 04 '22 11:02 jaronsummers

Some of us discussed the problem this proposal aims to address, i.e., allowing certain directories to be excluded when running go with patterns that include ....

We agree this is a problem for some tools (e.g. gopls, and others that accept go's import path patterns). Many tools have developed their own ways to configure exclusion rules (e.g. gopls has directoryFilters), but this is still not sufficient if they depend on go invocations with a ... pattern underneath.

@bcmills had a great idea during the discussion: go already has the overlay mechanism (see the summary of the feature by @matloob and also the -overlay flag description in the go command help page). That can be used as the directory exclusion mechanism. For exclusion, map the path to an empty value; for inclusion, use an identity mapping. gopls can implement this by applying its existing directoryFilters setting, and I guess other tools can do the same. (x/tools/go/packages supports overlays.)

The overlay config isn't as flexible as the glob patterns many dotfiles accept, but I think it still provides a sufficient knob for tools to play with. What do you think?


#50225 (a mechanism to fine-tune the scope of a module) was mentioned during the discussion, but I don't think that is the goal of this proposal. For example, I think it's possible that one wants to speed up gopls by excluding a directory but still wants to keep it in the distributed module (directories containing asset files, etc.), or that the directory doesn't affect module distribution at all (ephemeral directories such as node_modules or bazel directories created during the build).

@jaronsummers I think the Go team is trying to understand the problem better, not dismiss or ignore problems users are facing in the real world.

hyangah avatar Feb 18 '22 14:02 hyangah

I'm sure this was already mentioned here, but for clarity:

The simplest example case of this is if you have a node_modules directory which happens to have any Go code in it. When running "go mod tidy", the Go files in node_modules are scanned and their dependencies are included in go.mod. Ignoring node_modules would be the most obvious application of some .goignore feature.

paralin avatar Jun 16 '22 02:06 paralin

it was mentioned in the past but adding ignore support to go.mod would be flexible, no hardcoded rules, and no new magic files.

amery avatar Jun 16 '22 10:06 amery

@amery

it was mentioned in the past

And it was already rejected in the past:

... because go.mod is not a catch-all config file like package.json in NodeJS — https://github.com/golang/go/issues/42965#issue-755968588

antichris avatar Jun 16 '22 10:06 antichris

@antichris every solution has been rejected because developers don't recognize the problem. go.mod is not a catch-all and .goignore is.. another file

amery avatar Jun 16 '22 11:06 amery

I'd vote for go.ignore! It's another file, but it's not a dot-file, which was the main concern of Rob, and others I believe.

burdiyan avatar Jun 20 '22 08:06 burdiyan

I'd vote for go.ignore! It's another file, but it's not a dot-file, which was the main concern of Rob, and others I believe.

as long as it can be used to specify patterns to ignore I'm happy

amery avatar Jun 20 '22 08:06 amery

How about a go env variable, e.g. GOIGNOREFILE, that could be set to point to an arbitrary ignore pattern file with .gitignore-compatible syntax?

Users could then assign GOIGNOREFILE=go.ignore (or .goignore or even .gitignore), if they chose to do so, and no new "another" (dot or not) file is forced on anyone out of the box. It would also be possible to settle on an OOtB default value for it eventually, without breaking established workflows, yet providing a much needed relief in the meantime.

antichris avatar Jun 20 '22 09:06 antichris

Surprised there weren't any devops use cases yet, so I'll step in.

Many great projects are based on Go, among them docker, kubernetes, helm and terraform. It is only natural that many of us who deal with these tools daily become fluent in Go over time and start to use it more. A particular example is Terratest, a great framework used to write integration tests for Dockerfiles, helm charts and terraform modules. Terratest itself is a Go module, so the tests are executed with go test ./....

What that means is that every helm chart or terraform module repository is initialized as a Go module, despite that it doesn't have any Go code (other than tests).

I saw a mention somewhere that Go excludes folders whose names start with a dot. I am not sure how, but that isn't really happening for me. Maybe it is just a gopls-specific issue; I don't know. Terraform creates a .terraform folder, under which it creates subfolders and checks out the other modules this module depends on. My VS Code in gopls mode consumes a lot of CPU and generates lots of warnings about duplicated modules, because the TF modules in my VS Code workspace are indeed duplicated under the .terraform folders of the other modules that use them. This makes the whole setup so painful. And what if some other tool like terraform were to use a folder without a leading dot, just like npm, and there wasn't a way to modify its behavior?

This is such an easy feature to add that this discussion, spread across multiple tickets, has already consumed ten times more of everyone's time than it would take one single person to just implement it. I hope my struggle is not due to the Go developers' arrogance and that they are indeed trying to understand the problem, but somehow I find it hard to believe. I feel that to address this issue we will first need the help of a licensed therapist who would conduct a series of sessions with the maintainers and help them understand that, despite their awesomeness and undeniable historical contribution to humanity's legacy, the universe does not revolve around them and there are other tools in existence that Go needs to peacefully coexist with.

dee-kryvenko avatar Jun 23 '22 03:06 dee-kryvenko

@dee-kryvenko Please be charitable and respectful, per the Go Community Code of Conduct. Criticize the arguments, not the people. Thanks.

ianlancetaylor avatar Jun 23 '22 03:06 ianlancetaylor