cluster-api icon indicating copy to clipboard operation
cluster-api copied to clipboard

Importing API types pulls in lots of dependencies

Open JoelSpeed opened this issue 2 years ago • 45 comments

What steps did you take and what happened?

Create a simple program, as below and run go mod tidy:

package main

import (
	"fmt"

	capiv1beta1 "sigs.k8s.io/cluster-api/api/v1beta1"
)

func main() {
	myCluster := &capiv1beta1.Cluster{}

	fmt.Printf("My Cluster: %+v\n", myCluster)
}

This results in a go.mod with 56 dependencies. Included in this list is controller-runtime which, is also imported by a lot of consumers of the API types. controller-runtime has evolved a lot over time and, because it is being imported by the APIs currently, means that any project consuming the API types must also import the same controller-runtime as the rest of CAPI.

There are use cases where, building a consumer, I do not need to use any of the code (defaulting, validation etc) from CAPI and as such, do not need such a restriction on my controller-runtime version.

If we modify the way the SchemeBuilder works and move the webhooks to a separate package, then we can reduce the required imports for the API package down to just 31 dependencies, and, these dependencies do not include controller-runtime or other fast moving/changing packages. There are still obviously a good number of deps in there from K8s paths, but these are mostly API machinery related and have pretty stable APIs.

I think if we can separate the webhooks out in this way (I notice Cluster and ClusterClass already are?) then I think this would make consuming these API types in other projects a lot easier.

Note how k8s.io/api does not include much code with the API types for this reason as well.

The alternative solution for consumers is to copy/paste the types out to their own repos which is less ideal.

What did you expect to happen?

API types should not be tied to many dependencies, especially not controller-runtime.

Cluster API version

main branch

Kubernetes version

N/A

Anything else you would like to add?

No response

Label(s) to be applied

/kind bug One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels.

TODO

  • [x] Update schemebuilder in ./api (#9045)
  • [x] Update schemebuilder in ./controlplane/kubeadm/api (#9266)
  • [x] Update schemebuilder in ./bootstrap/kubeadm/api (#9266)
  • [x] Update schemebuilder in ./exp/api (#9185)
  • [x] Update schemebuilder in ./exp/addons/api (#9185)
  • [x] Update schemebuilder in ./exp/ipam/api (#9185)
  • [x] Update schemebuilder in ./exp/runtime/api (#9185)
  • [x] Update schemebuilder in ./cmd/clusterctl/api (#9266)
  • [x] Update schemebuilder in ./test/infrastructure/docker/api (#9266)
  • [x] Update schemebuilder in ./test/infrastructure/docker/exp/api (#9266)
  • [x] Update schemebuilder in ./test/infrastructure/inmemory/api (#9266)
  • [x] Move webhooks for ./api (#9047)
  • [x] Move webhooks for ./controlplane/kubeadm/api (#9410)
  • [x] Move webhooks for ./bootstrap/kubeadm/api (#9410)
  • [x] Move webhooks for ./exp/api (#9417)
  • [x] Move webhooks for ./exp/addons/api (#9438)
  • ~[ ] Move webhooks for ./exp/ipam/api~ (N/A)
  • ~[ ] Move webhooks for ./exp/runtime/api~ (N/A)
  • ~[ ] Move webhooks for ./cmd/clusterctl/api ~ (N/A)
  • [x] Move webhooks for ./test/infrastructure/docker/api (#9458)
  • [x] Move webhooks for ./test/infrastructure/docker/exp/api (#9460)
  • [x] Move webhooks for ./test/infrastructure/inmemory/api (#9459)
  • [ ] Add verification to avoid regressions (e.g. https://github.com/kubernetes-sigs/cluster-api/pull/9482#issuecomment-1730060650)
  • [ ] Ensure docs for provider implementation explain schemebuilder conventions
  • [ ] Ensure docs for provider implementation explain separate webhooks conventions
  • [x] Add import-boss to prevent regressions
    • [x] ./api #9407
    • [x] enforce the restrictions in all packages (as soon as we can) (@Ankitasw) #9461

JoelSpeed avatar Jul 18 '23 18:07 JoelSpeed

/triage accepted

vincepri avatar Jul 18 '23 18:07 vincepri

This all sounds good, a few additional comments would be:

  • We should have a clear way on how to structure webhooks in our API packages across Cluster API and its ecosystem of providers
  • Separate the SchemeBuilder changes from the webhook ones
  • Open an issue in Kubebuilder to potentially change the default way new projects or packages are generated

Thoughts? cc @CecileRobertMichon @fabriziopandini @sbueringer

vincepri avatar Jul 18 '23 18:07 vincepri

We should have a clear way on how to structure webhooks in our API packages across Cluster API and its ecosystem of providers

Agreed, I don't have context on why Cluster and ClusterClass are separate, but, their pattern works well. Do you think we need to have versioned defaulting/validation? If so, does it make sense to have the webhooks in versioned packages as in my POC branch?

If they don't need to be versioned, having them centrally as for Cluster and ClusterClass would work.

Separate the SchemeBuilder changes from the webhook ones

Ack yes, I can make a PR for this separately so it's considered as a different part of the same effort


I think it would be good to have a consistent approach across CAPI and the related repos, if there is a pattern we can come up with that we can document then I would like to also pursue updating the provider APIs to mitigate this issue there as well

JoelSpeed avatar Jul 18 '23 18:07 JoelSpeed

Agreed, I don't have context on why Cluster and ClusterClass are separate, but, their pattern works well. Do you think we need to have versioned defaulting/validation? If so, does it make sense to have the webhooks in versioned packages as in my POC branch?

The reason why Cluster and ClusterClass are separate is the following:

  • we couldn't just implement the Validate / Default methods on the types directly because we needed a client in those webhooks
  • So we already had to use CustomDefaulter / CustomValidator
  • I didn't want to add even more dependecies to our API package so I suggested to move them to a separate package

I agree that we can just follow the same pattern for all webhooks.

If they don't need to be versioned, having them centrally as for Cluster and ClusterClass would work.

The validate / default funcs / webhooks don't have to be versioned. Whenever we introduced a new apiVersion we moved the methods from the previous apiVersion to the new one :).

So overall sounds good to me. I'm absolutely in favor!

sbueringer avatar Jul 18 '23 18:07 sbueringer

+1

Agree anything we can do to limit importing controller-runtime only where necessary is an improvement

CecileRobertMichon avatar Jul 18 '23 22:07 CecileRobertMichon

+1, thanks for creating this Joel. I find having the ability to vendor the API cleanly critical for consumers and to ease adoption of the project. I think reducing the gomod hard dep on webhooks and solve it generically makes total sense and is a step in the right direction. For ref old discussion on this topic.

enxebre avatar Jul 19 '23 07:07 enxebre

We discussed this issue at the backlog grooming on Oct 29th 2021, the consensus was that for now we're not able to fully decouple our API types from the controller runtime dependencies due to conversion webhooks which are strictly tied to our types. https://github.com/kubernetes-sigs/cluster-api/issues/3465#issuecomment-954932969

Just to double-check. I think I might have brought this up at the time. If I now take a look at our current code this doesn't seem to be an issue as as we only depend on "sigs.k8s.io/controller-runtime/pkg/conversion" (for conversion.Hub) in old apiVersions. So it won't be a problem for cleaning up dependencies of the latest apiVersion (which I assume is our main goal).

sbueringer avatar Jul 19 '23 08:07 sbueringer

+1 from my side, thanks for reviving this discussion @JoelSpeed!

fabriziopandini avatar Jul 19 '23 10:07 fabriziopandini

I did a bit of work on the code for this today and came across a small issue.

Currently the package internal/test/builder imports the defaulting functions from clusterv1, if we move these to webhooks, it creates an import cycle because the tests are importing the internal/test/builder to run tests on the webhooks.

I can see 3 possible paths forward:

  • Keep the Default() functions inside the clusterv1 package and require that they do not use any import from controller-runtime (not ideal, don't know that this is even possible)
  • Remove the defaulting bit from the builder (it's for MHC and MD) and handle the defaulting in the tests where those are needed (there are 9 instances for MHC, 1 for MD)
  • Try to move the webhook tests to a separate go package, webhooks_test, which may solve the issue as well (loses access to private functions)

Any preferred method from those or alternative suggestions from anyone?

JoelSpeed avatar Jul 19 '23 13:07 JoelSpeed

Remove the defaulting bit from the builder (it's for MHC and MD) and handle the defaulting in the tests where those are needed (there are 9 instances for MHC, 1 for MD)

From a quick look it should be relatively easy to move the defaulting to the tests themselves rather than the builders. There's only a few instances of this usage. @chrischdi - I think you added this when moving these tests to envtest, is there something I'm not thinking of that could stand in the way?

killianmuldoon avatar Jul 19 '23 13:07 killianmuldoon

Remove the defaulting bit from the builder (it's for MHC and MD) and handle the defaulting in the tests where those are needed (there are 9 instances for MHC, 1 for MD)

Going to play with this idea then, will report back if I hit more issues. Expect some PRs over the coming days

JoelSpeed avatar Jul 19 '23 13:07 JoelSpeed

If I got it right we already had and solved this issue before. We have a separate test package here for those kind of tests: https://github.com/kubernetes-sigs/cluster-api/tree/3e16abc3df71314bbaee770846beba45c0eb170d/internal/webhooks/test

EDIT: My bad. That's a separate issue, but maybe you'll hit that one as well :). Agree with Killian for your current problem EDIT2: Okay not I'm entirely sure, maybe the solution works the same way

sbueringer avatar Jul 19 '23 18:07 sbueringer

Remove the defaulting bit from the builder (it's for MHC and MD) and handle the defaulting in the tests where those are needed (there are 9 instances for MHC, 1 for MD)

From a quick look it should be relatively easy to move the defaulting to the tests themselves rather than the builders. There's only a few instances of this usage. @chrischdi - I think you added this when moving these tests to envtest, is there something I'm not thinking of that could stand in the way?

Nope, looks good. At the time of implementing it was just the most straightforward one. Should be totally fine to default at the test and remove that from the builder 👍

@sbueringer because we just chatted about the depguard linter: maybe that one would be useful if the config is flexible enough, to prevent regressions if this issue got solved 🙂

chrischdi avatar Jul 20 '23 05:07 chrischdi

I've just posted two PRs to show the progress I've made so far on the main Cluster v1beta1 APIs, the first of which for scheme builder should be ready for review, the second I think is ready for review apart from I know I have to sort the MachineDeployment defaulting in tests, otherwise I think that should be fairly good to go

JoelSpeed avatar Jul 21 '23 15:07 JoelSpeed

Both PRs are now ready for review and, combined, clear the deps for the main APIs (Machine, MS, MD, Cluster etc). Will start to look at the exp APIs next

JoelSpeed avatar Jul 27 '23 12:07 JoelSpeed

Have added a TODO list with my initial thoughts on scope here, any additional points we need for this issue?

JoelSpeed avatar Aug 01 '23 09:08 JoelSpeed

Seems ~ good to me. Does your list cover all those APIs?

stable ./api ./controlplane/kubeadm/api ./bootstrap/kubeadm/api

exp ./exp/api ./exp/addons/api ./exp/ipam/api ./exp/runtime/api

clusterctl ./cmd/clusterctl/api (only one usage of scheme.Builder)

test ./test/infrastructure/docker/api ./test/infrastructure/docker/exp/api ./test/infrastructure/inmemory/api

Totally fine to focus on stable + exp + maybe clusterctl (because it's a quickwin) and then create a follow-up issue for test (I would mostly migrate test as well for consistency across the entire code base)

P.S. I'll try to find some time to review your PRs soon. Pretty low bandwidth at the moment

sbueringer avatar Aug 02 '23 16:08 sbueringer

I've updated the list to be very explicit, will keep track of the PRs in the list as we go along as well

JoelSpeed avatar Aug 03 '23 10:08 JoelSpeed

@killianmuldoon Do you have some time to replicate this issue for CAPV and work on it going forward? (implementation is not urgent, won't make it / doesn't have to make it into CAPV v1.8)

sbueringer avatar Aug 03 '23 13:08 sbueringer

@killianmuldoon Do you have some time to replicate this issue for CAPV and work on it going forward? (implementation is not urgent, won't make it / doesn't have to make it into CAPV v1.8)

By that you mean removing the controller-runtime package from the CAPV API packages?

killianmuldoon avatar Aug 03 '23 13:08 killianmuldoon

By that you mean removing the controller-runtime package from the CAPV API packages?

Basically implementing the same in CAPV that Joel is implementing in core CAPI. So yes, if I got your question correctly :) (although what Joel is implementing only removes the dependency from the v1beta1 packages, but that's enough)

sbueringer avatar Aug 03 '23 14:08 sbueringer

Sure thing - I'll create an issue and assign it to myself.

killianmuldoon avatar Aug 03 '23 15:08 killianmuldoon

We are planning to do the same with the CAPI operator. Webhooks are already in a separate package. We have a dependency on controller runtime in the API itself that will be removed in the new API version. This change will definitely simplify consuming CAPI/CAPI operator programmatically.

alexander-demicev avatar Aug 07 '23 11:08 alexander-demicev

Just found this tool https://pkg.go.dev/k8s.io/code-generator/cmd/import-boss

At a first glance this could help us to guard against regressions

sbueringer avatar Aug 10 '23 05:08 sbueringer

+1 to add import-boss as a final step of this effort

fabriziopandini avatar Aug 13 '23 10:08 fabriziopandini

Added a new PR to move the schemebuilder over for the experimental APIs, https://github.com/kubernetes-sigs/cluster-api/pull/9185

JoelSpeed avatar Aug 14 '23 15:08 JoelSpeed

Another PR to remove the final reliance on scheme builder stuff, https://github.com/kubernetes-sigs/cluster-api/pull/9266

JoelSpeed avatar Aug 22 '23 16:08 JoelSpeed

Talked to @Ankitasw, she'll take over ./bootstrap/kubeadm/api. Thank you!

sbueringer avatar Sep 13 '23 11:09 sbueringer

Talked to @Ankitasw, she'll take over ./bootstrap/kubeadm/api. Thank you!

I have got that in the PR #9410 already. If needed, I will just remove the commit.

odvarkadaniel avatar Sep 13 '23 13:09 odvarkadaniel

No worries, I can pick ./exp/api if ./bootstrap/kubeadm/api is already covered. cc @sbueringer

Ankitasw avatar Sep 13 '23 13:09 Ankitasw