consul icon indicating copy to clipboard operation
consul copied to clipboard

build: Adopt go.work and remove submodule replace statements

Open zalimeni opened this issue 1 year ago • 4 comments

Description

Adopt go.work and make supporting changes to synchronize dependencies across locally-replaced modules.

This provides several advantages:

  • replace statements can be removed from individual go.mod files
    • This reduces the toil of release steps for any release branch containing this change, since it takes away the requirement of commenting out replace in order to ensure go.sum contains entries for any locally-replaced module paths
    • A subtle but nice side-effect of ^ is that we no longer have to "orphan" commits in order to cut submodules; they can now be part of the normal release branch (.x) git history
    • Another nice side-effect of ^ is that our git history for submodules will be linear again; go get ...@latest/@upgrade won't produce extremely old-looking pseudo-versions when importing SHAs
  • Our dependencies across submodules will be kept in lockstep when upgrades are made, ensuring compatibility and avoiding drift

Considerations

I think this change warrants some consideration before merging! Here's some factors that we should weigh in our decision vs. the benefits noted above:

The go command will maintain a go.work.sum file that keeps track of hashes used by the workspace that are not in collective workspace modules’ go.sum files.

It is generally inadvisable to commit go.work files into version control systems, for two reasons:

  • A checked-in go.work file might override a developer’s own go.work file from a parent directory, causing confusion when their use directives don’t apply.
  • A checked-in go.work file may cause a continuous integration (CI) system to select and thus test the wrong versions of a module’s dependencies. CI systems should generally not be allowed to use the go.work file so that they can test the behavior of the module as it would be used when required by other modules, where a go.work file within the module has no effect.

That said, there are some cases where committing a go.work file makes sense. For example, when the modules in a repository are developed exclusively with each other but not together with external modules, there may not be a reason the developer would want to use a different combination of modules in a workspace. In that case, the module author should ensure the individual modules are tested and released properly.

  • I think the second bullet above is worth considering before making this change. In my view, Consul is primarily a monorepo, and secondarily a source of submodules. For that reason, I think you can argue that it is reasonable to consider go.work in CI, given that we use replace widely today for the same effect (using local code rather than published submodule tags), and we are not using go.work here to replace non-local dependencies.
  • Forcing lockstep on dependencies could be seen as a disadvantage in specific scenarios where we might want a submodule to break with others. I see this as unlikely for the submodules we publish intended for public consumption (api, sdk, proto-public, envoyextensions, troubleshoot).
  • https://github.com/golang/go/issues/53502 has a good amount of back-and-forth on the advantages and disadvantages of versioning a go.work file, which led to the warning above. IMO, we fall into the "monorepo w/ submodules" case, and by forcing go work sync in CI, we can sidestep the unexpected pitfalls that can come w/ the overriding power of go.work. Also, we aren't explicitly setting any versions in ours - just controlling replace statements.
  • For different reasons, Kubernetes recently adopted a versioned go.work. Less relevant to our use cases, but notable.
  • This is a fairly reversible decision, and many of the changes here are arguably independently useful to suss out weird edges and old dependencies, so I think it's feasible to trial it in main and make a final call closer to 1.21 LTS.

Change list

Because this involved synchronizing previously drifted dependencies, some additional changes were forced on this PR.

The full list of changes required to make this happen included:

  • add go.work and synchronize go.mod files
  • add go work sync check to CI and Makefile
  • remove replace statements (replaced by go.work)
  • fix tencentcloud ambiguous import by removing consul from test-sds-server and consul-container dependencies and upgrading go-discover
  • fix security scan by cloning scanner outside consul repo
  • fix broken otel exporter due to metrics upgrade
  • upgrade otel further to fix library compatibility with itself which borrows its decode package
  • fix consul-container lint for deprecated gRPC methods
  • fix test-sds-server Dockerfile
  • proto-gen following protoc bump to swap interface{} for any

Backport notes

I think it would be ideal if we backported the core go.work changes to our active release branches. This would make backports of future go.mod and go.work changes simpler. The cost of doing this is potentially repeating several of the steps noted above by hand, since the final go.mod and go.sum for earlier versions of consul will require a difference set of dependencies. The alternative is keeping this change limited to main, and waiting until 1.21 to get the desired impact on submodule releases.

Regardless, I think it's valuable to first kick the tires on this change a bit before backporting, and see 1) how well it handles dependency bumps and 2) how much/what sort of impact it has on backports.

Any reviewer feedback on this is welcome, and while I'm on PTO, please feel free to pick this PR up and run with it.

Post-merge followups

  • Update internal release guide to drop the replace statement shuffle and related notes about orphan commits, replacing with instructions to tag submodules directly from the .x release branch after dependency updates are made
  • Consider updating the reusable-get-go-version workflow to rely on go.work toolchain directive rather than .go-version for selecting the Go version for builds

PR Checklist

  • [ ] updated test coverage
  • [ ] external facing docs updated
  • [ ] appropriate backport labels added
  • [ ] not a security concern

zalimeni avatar Aug 26 '24 14:08 zalimeni

NTS: we might need to double-run make go-mod-tidy with a go work sync in between - it seems to keep ending up w/ extra changes after the first run. Most likely just need to run go work sync first always, since this modifies the other modules, but I'm not certain that has worked every time as I've made these updates.

zalimeni avatar Sep 18 '24 13:09 zalimeni

This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation.

This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation.

Tagging a few folks who may have interest for review.

zalimeni avatar Oct 18 '24 17:10 zalimeni

Closing this PR due to priority.

jmurret avatar Nov 21 '24 20:11 jmurret

Closing this PR due to priority.

ACK. Thank you for reviewing while I was on leave!

zalimeni avatar Dec 10 '24 05:12 zalimeni