
Caching should use go.mod, not go.sum

Open peterbourgon opened this issue 1 year ago • 10 comments

go.sum is an append-only log of checksums, used to verify the integrity of modules downloaded during builds. It's essentially a manifest file (shasums) and not any kind of lock file (Cargo.lock). It doesn't represent the dependencies of the corresponding module in any meaningful sense. This dependabot issue goes into more detail.

Cache keys for Go modules need to be based on the (normalized) content of go.mod, not go.sum, in order to be useful.
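As a rough illustration of what a key based on "(normalized) content of go.mod" could look like, here is a minimal Go sketch. The normalization rule (drop `//` comments, blank lines, and extra whitespace) and the names `normalizeGoMod` and `cacheKey` are assumptions made for this example; they are not taken from setup-go. It does not canonicalize block-form versus single-line `require` directives.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// normalizeGoMod is a hypothetical normalization: it strips // comments,
// blank lines, and redundant whitespace so that cosmetic edits to go.mod
// do not change the resulting cache key.
func normalizeGoMod(content string) string {
	var out []string
	for _, line := range strings.Split(content, "\n") {
		if i := strings.Index(line, "//"); i >= 0 {
			line = line[:i] // drop trailing comment, e.g. "// indirect"
		}
		line = strings.Join(strings.Fields(line), " ") // collapse whitespace
		if line != "" {
			out = append(out, line)
		}
	}
	return strings.Join(out, "\n")
}

// cacheKey hashes the normalized go.mod content. The "go-mod-" prefix is
// an arbitrary choice for this sketch.
func cacheKey(goMod string) string {
	sum := sha256.Sum256([]byte(normalizeGoMod(goMod)))
	return "go-mod-" + hex.EncodeToString(sum[:])
}

func main() {
	a := "module example.com/app\n\ngo 1.22\n\nrequire example.com/dep v1.2.3 // indirect\n"
	b := "module example.com/app\ngo 1.22\nrequire   example.com/dep v1.2.3\n"
	fmt.Println(cacheKey(a) == cacheKey(b)) // true: only comments and spacing differ
}
```

With this rule, bumping a required version changes the key, while reformatting the file does not.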

peterbourgon avatar May 09 '24 22:05 peterbourgon

Hello @peterbourgon, Thank you for creating this issue and we will look into it :)

aparnajyothi-y avatar May 10 '24 13:05 aparnajyothi-y

go.sum is an append-only log of checksums

Note: go.sum will be pruned as dependencies are removed if you run go mod tidy (from: https://go.dev/ref/mod#go-sum-files):

go mod tidy will add missing hashes and will remove unnecessary hashes from go.sum.

used to verify the integrity of modules downloaded during builds

Is this not a suitable file to use as a cache key? If some new file needs to be downloaded, then the cache should be updated to include that new file.

matthewhughes934 avatar Jun 04 '24 17:06 matthewhughes934

Is this not a suitable file to use as a cache key? If some new file needs to be downloaded, then the cache should be updated to include that new file.

Unfortunately not, no.

Again, go.sum isn't a lock file, and doesn't (necessarily) represent the actual dependencies used by the module. In fact, it doesn't even need to be committed! It exists purely to verify any dependencies fetched as part of the build process.

The go.sum file contains cryptographic hashes of the module’s direct and indirect dependencies ... The go.sum file may contain hashes for multiple versions of a module. The go command may need to load go.mod files from multiple versions of a dependency in order to perform minimal version selection. go.sum may also contain hashes for module versions that aren’t needed anymore.
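To make the quoted passage concrete, here is a small Go sketch that lists which versions of each module appear in a go.sum file. It relies only on the go.sum line format (`module version hash`, with a `/go.mod` suffix on the version for go.mod-only entries). The function name `versionsInGoSum`, the module path, and the hash values are placeholders invented for the example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// versionsInGoSum returns, for each module path, the sorted set of versions
// that have at least one line in the given go.sum content.
func versionsInGoSum(goSum string) map[string][]string {
	seen := map[string]map[string]bool{}
	for _, line := range strings.Split(goSum, "\n") {
		fields := strings.Fields(line)
		if len(fields) != 3 {
			continue // skip blank or malformed lines
		}
		mod := fields[0]
		ver := strings.TrimSuffix(fields[1], "/go.mod")
		if seen[mod] == nil {
			seen[mod] = map[string]bool{}
		}
		seen[mod][ver] = true
	}
	out := map[string][]string{}
	for mod, vers := range seen {
		for v := range vers {
			out[mod] = append(out[mod], v)
		}
		sort.Strings(out[mod])
	}
	return out
}

func main() {
	// Placeholder hashes: real entries carry base64-encoded SHA-256 digests.
	sample := "example.com/dep v1.0.0/go.mod h1:PLACEHOLDER1=\n" +
		"example.com/dep v1.1.0 h1:PLACEHOLDER2=\n" +
		"example.com/dep v1.1.0/go.mod h1:PLACEHOLDER3=\n"
	fmt.Println(versionsInGoSum(sample)["example.com/dep"]) // [v1.0.0 v1.1.0]
}
```

In the sample, only v1.1.0 has a full module hash, yet v1.0.0 still appears because its go.mod was loaded, which is why the file's contents are a superset of what the build actually uses.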

Just use go.mod and the problem is solved.

And don't take my word for it: github.blog, etc.

peterbourgon avatar Jun 07 '24 20:06 peterbourgon

Hello @peterbourgon,

Thank you once again for creating this issue. We have analyzed using go.mod instead of go.sum for caching and identified the following key points:

  • The go.mod file declares a project's dependencies. Any changes here represent changes to the project's dependencies, necessitating a cache update.
  • The go.sum file logs all downloaded modules and includes checksums for integrity verification but doesn't represent the actual dependencies used by the project.

We will check the feasibility of the requested implementation and consider it as a feature request once we receive some feedback.

aparnajyothi-y avatar Aug 22 '24 04:08 aparnajyothi-y

Thank you!

peterbourgon avatar Aug 22 '24 14:08 peterbourgon

The caching performed by actions/setup-go is ineffective at caching gocache and gomodcache contents with my project and this may be one contributing factor. I stress that I don't know this for sure.

All I know is that when I cache the gocache directory myself using actions/cache, I benefit from significantly faster build, test and lint performance.

I do see evidence that setup-go is effectively caching some or all of gomodcache, so it seems my issue is mostly limited to gocache contents (which govern the behavior of go install, golangci-lint and go test).

xeger avatar Sep 09 '24 23:09 xeger

I also notice that many files are downloaded each time I run go test or go build. I don't think that setup-go is effective. For people who work at GitHub: do you have analytics on the GitHub Actions workflows where setup-go is present?

remyleone avatar Sep 10 '24 07:09 remyleone

On Tue, Sep 10, 2024 at 1:30 a.m., Rémy Léone < @.***> wrote:

I also notice that many files are downloaded each time I run go test or go build. I don't think that setup-go is effective. For people who work at GitHub: do you have analytics on the GitHub Actions workflows where setup-go is present?


Mago16 avatar Sep 10 '24 09:09 Mago16

@aparnajyothi-y Any updates on this?

kishaningithub avatar Oct 24 '24 12:10 kishaningithub

If you run go mod tidy and lint that it has been run, using go.sum (alongside go.mod) is better than using go.mod alone. The blog post linked earlier only covers the case where tidy is not used and the concern is which module versions are actually used.

If upstream, or someone or something else, changes a binary in the cache without changing its version, you will not see that unless the sum file is also in git.

Assuming you have both in git and the sum file is tidied, the question becomes whether you want to support go.sum changing without go.mod changing. If so, use go.sum as the cache key; otherwise, use go.mod.

fingon avatar Feb 16 '25 05:02 fingon

Among other things, go.mod includes Go language and toolchain versions. It should be used instead of go.sum, which, as mentioned above, does not even need to be committed.

AlekSi avatar Dec 05 '25 07:12 AlekSi

If you do not commit go.sum, you will not have consistent builds OR verifiable builds.

The correct answer is always to use both: if either one changes, the environment should be treated as different from what it was.
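A minimal Go sketch of a key derived from both files, so that a change to either one produces a different key. The function name `combinedKey` and the "go-deps-" prefix are made up for this example, and this is not how setup-go computes its key.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// combinedKey hashes each input separately and then hashes the concatenated
// digests, so moving bytes from one file to the other cannot yield the same key.
func combinedKey(files ...string) string {
	h := sha256.New()
	for _, content := range files {
		digest := sha256.Sum256([]byte(content))
		h.Write(digest[:])
	}
	return "go-deps-" + hex.EncodeToString(h.Sum(nil))
}

func main() {
	goMod := "module example.com/app\ngo 1.22\n"
	goSum := "example.com/dep v1.0.0 h1:PLACEHOLDER=\n" // placeholder hash
	base := combinedKey(goMod, goSum)
	fmt.Println(base == combinedKey(goMod, goSum))         // true
	fmt.Println(base == combinedKey(goMod, goSum+"extra")) // false
}
```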

fingon avatar Dec 05 '25 07:12 fingon

If you do not commit go.sum, you will not have consistent builds OR verifiable builds.

I’m sorry, but that's simply incorrect. The set of dependencies is defined by the go.mod file, not go.sum. Checksums in most cases are verified by https://sum.golang.org or a compatible service. All that was described above.

AlekSi avatar Dec 05 '25 07:12 AlekSi

If you do not commit go.sum, you will not have consistent builds OR verifiable builds.

I’m sorry, but that's simply incorrect. The set of dependencies is defined by the go.mod file, not go.sum. Checksums in most cases are verified by https://sum.golang.org or a compatible service. All that was described above.

That is simply untrue in the general case. If someone republishes e.g. version 1.0 of a package (with different hash), go.mod happily refers to it, and depending on when you run the go build (and what is in your local mod cache), you will either use old 1.0 or new 1.0.

This is exactly what go.sum was added to protect against (it stores the hash of what v1.0 really meant at the time someone added it, as opposed to what is found later).

go.mod defines the logical version dependencies (transitively); go.sum pins the exact content of those dependencies so that the build stays consistent with what they were when they were added, from then on.
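The republishing scenario can be sketched in Go. The `h1` function below is modeled on the "h1:" directory hash that go.sum entries use (a SHA-256 over a sorted list of per-file SHA-256 digests and file names, base64-encoded). It is written from memory for illustration rather than by calling golang.org/x/mod, and the module path and file contents are invented.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"sort"
)

// h1 computes an "h1:"-style hash over an in-memory file set (name -> content):
// one "<hex sha256>  <name>\n" line per file in sorted name order, then a
// SHA-256 of that listing, base64-encoded.
func h1(files map[string]string) string {
	names := make([]string, 0, len(files))
	for name := range files {
		names = append(names, name)
	}
	sort.Strings(names)

	listing := sha256.New()
	for _, name := range names {
		fmt.Fprintf(listing, "%x  %s\n", sha256.Sum256([]byte(files[name])), name)
	}
	return "h1:" + base64.StdEncoding.EncodeToString(listing.Sum(nil))
}

func main() {
	// Same version tag, different content: the recorded hash no longer matches.
	original := map[string]string{"example.com/dep@v1.0.0/dep.go": "package dep\n"}
	republished := map[string]string{"example.com/dep@v1.0.0/dep.go": "package dep\n\nvar Changed = true\n"}
	fmt.Println(h1(original) == h1(republished)) // false
}
```

A hash recorded from the original content fails to match the republished content even though the version string is identical.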

If you think I am 'wrong', please describe how consistent reproducible builds work without go.sum.

fingon avatar Dec 05 '25 09:12 fingon

I already mentioned https://sum.golang.org in my previous comment, which you ignored. Please check it out.

and depending on when you run the go build (and what is in your local mod cache), you will either use old 1.0 or new 1.0.

The checksum is checked against https://sum.golang.org on module download and on go mod verify.

AlekSi avatar Dec 05 '25 12:12 AlekSi

and depending on when you run the go build (and what is in your local mod cache), you will either use old 1.0 or new 1.0.

The checksum is checked against https://sum.golang.org on module download and on go mod verify.

How does it work for private repositories?

Even for public repositories, there is no guarantee that the moment Google fetched and stored the hash was the first time the module was actually available under that tag, which can lead to module cache inconsistencies between nodes.

Finally, not everyone uses Google's service, since go.sum contains the relevant bits and covers the two cases above that Google's service does not. (Using the service is currently a build option, not a requirement, and any network access slows you down.)

fingon avatar Dec 05 '25 12:12 fingon