Improve CoreDNS support and validation in KubeadmControlPlane

Open wfernandes opened this issue 4 years ago • 24 comments

User Story

As part of issue https://github.com/kubernetes-sigs/cluster-api/issues/2545, we added a dependency on github.com/coredns/corefile-migration in PR https://github.com/kubernetes-sigs/cluster-api/pull/2574.

This library needs to be kept up to date if we are to support future CoreDNS version upgrades, so we will need automation that checks for new releases of the library and updates our modules. Alternatively, we decouple ourselves from this library somehow. 🙂

See comment below from @detiber:

I'm wondering if we should think about some type of automated checking to see if it is out of date, since the current implementation will require us to cut a new release if a newer version of CoreDNS is released prior to us being able to support it (based on the version comparison check)

https://github.com/kubernetes-sigs/cluster-api/pull/2574/files/9cb95d52d539b8b3109951909588388781ba7acd#r389755735

/cc @vincepri @ncdc
/kind feature

wfernandes avatar Mar 09 '20 17:03 wfernandes

We should look into switching over to using the CoreDNS operator as soon as it's mature and stable.

ncdc avatar Mar 11 '20 17:03 ncdc

@stealthybox @neolit123 - What's the state of the addons, and the latest CoreDNS operator?

timothysc avatar Mar 13 '20 18:03 timothysc

The CoreDNS operator needs a controller runtime v2 update, but it should be available in the near future.

This will help us extract the migration logic out of capi and kubeadm. But the version binding would remain in any form, since consumers would need to track operator versioning.

Benefits would come if A) we can track the latest version of the operator B) old versions of the operator can install and migrate newer CoreDNS.

neolit123 avatar Mar 13 '20 23:03 neolit123

Where is the coredns operator code hosted today?

vincepri avatar Mar 16 '20 16:03 vincepri

there is a coredns operator here: https://github.com/kubernetes-sigs/cluster-addons/tree/master/coredns but AFAIK this is not the operator that the coredns maintainers created.

i also see there is a related GSoC task for this opened 5 days ago that includes ConfigMap updates: https://github.com/kubernetes-sigs/cluster-addons/issues/47 cc @johnsonj

@rajansandeep mentioned to me that he is personally working on the operator, so i do not know how/if this intersects with GSoC and if there is source code in another repository.

is the GSoC task supposed to be completed by a student or is this adjacent work? my personal preference would be to delegate the main body of work away from GSoC - i.e. have a working version with CRv2 ideally sooner than end-of-GSoC.

neolit123 avatar Mar 16 '20 17:03 neolit123

A couple of questions I had:

  • Will this CoreDNS Operator be installed and/or managed by clusterctl?
  • What is GSoC?

wfernandes avatar Mar 16 '20 18:03 wfernandes

For v0.3.x we might need a different issue to add a verification script to update to the latest version of the module, if it's available
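
A minimal sketch of what such a verification step could look like, assuming it is fed the JSON that `go list -m -u -json <module>` prints for a single module. The `Path`, `Version` and `Update` fields come from that command's output; the helper name `pendingUpdate`, the `example.com` module path and the version numbers are made up for illustration and are not taken from cluster-api.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// module mirrors the subset of `go list -m -u -json` output used here.
type module struct {
	Path    string
	Version string
	Update  *module // set only when a newer version is available
}

// pendingUpdate (hypothetical helper) decodes the JSON printed for one
// module and reports the pinned version, the newest version, and whether
// the two differ.
func pendingUpdate(listJSON []byte) (current, latest string, outdated bool, err error) {
	var m module
	if err = json.Unmarshal(listJSON, &m); err != nil {
		return "", "", false, fmt.Errorf("decoding go list output: %w", err)
	}
	if m.Update == nil {
		// No Update object means the pinned version is already the newest.
		return m.Version, m.Version, false, nil
	}
	return m.Version, m.Update.Version, true, nil
}

func main() {
	// Illustrative input only; module path and versions are invented.
	sample := []byte(`{
		"Path": "example.com/some/module",
		"Version": "v1.0.0",
		"Update": {"Path": "example.com/some/module", "Version": "v1.0.1"}
	}`)

	current, latest, outdated, err := pendingUpdate(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if outdated {
		fmt.Printf("update available: %s -> %s\n", current, latest)
	} else {
		fmt.Printf("up to date at %s\n", current)
	}
}
```

A CI job could pipe the real command output into a check like this and fail when `outdated` is true, which would flag a stale pin without anyone having to notice it by hand.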

vincepri avatar Mar 16 '20 18:03 vincepri

Will this CoreDNS Operator be installed and/or managed by clusterctl?

unclear to me. would make sense to install it in an "addons phase" of sorts.

What is GSoC?

google summer of code.

neolit123 avatar Mar 16 '20 18:03 neolit123

Thanks for the cc @neolit123 - I'll discuss this in cluster-addons and with @rajansandeep

johnsonj avatar Mar 17 '20 15:03 johnsonj

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

fejta-bot avatar Jun 15 '20 15:06 fejta-bot

/lifecycle frozen

vincepri avatar Jun 15 '20 15:06 vincepri

/milestone v0.4.x

vincepri avatar Feb 19 '21 17:02 vincepri

/milestone Next

vincepri avatar Oct 19 '21 14:10 vincepri

/milestone v1.2
/retitle Improve CoreDNS support and validation in KubeadmControlPlane
/assign @sbueringer

  • [x] Automate the version bump of the coredns migration library (somehow)
  • [ ] Validate that users won't be able to set a version higher than the maximum supported within the library
  • [ ] Validate that the version has a prefix (v)
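
The two unchecked validations could be sketched roughly as follows. This is an illustration under stated assumptions, not the KubeadmControlPlane webhook code: `parseTag` and `validateCoreDNSTag` are hypothetical names, the maximum supported version is passed in as a plain string rather than read from the migration library, and tags are assumed to have the simple `vMAJOR.MINOR.PATCH` form with no pre-release or build suffix.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseTag (hypothetical helper) turns a tag such as "v1.8.4" into
// [major, minor, patch]. It rejects tags without the leading "v" and
// anything that is not three dot-separated non-negative integers.
func parseTag(tag string) ([3]int, error) {
	var out [3]int
	if !strings.HasPrefix(tag, "v") {
		return out, fmt.Errorf("tag %q must start with \"v\"", tag)
	}
	parts := strings.Split(strings.TrimPrefix(tag, "v"), ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("tag %q is not of the form vMAJOR.MINOR.PATCH", tag)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return out, fmt.Errorf("tag %q has an invalid component %q", tag, p)
		}
		out[i] = n
	}
	return out, nil
}

// validateCoreDNSTag (hypothetical helper) enforces both rules: the tag
// carries the "v" prefix and is not newer than maxSupported.
func validateCoreDNSTag(tag, maxSupported string) error {
	got, err := parseTag(tag)
	if err != nil {
		return err
	}
	limit, err := parseTag(maxSupported)
	if err != nil {
		return fmt.Errorf("invalid maximum supported version: %w", err)
	}
	// Compare major, then minor, then patch; the first difference decides.
	for i := range got {
		if got[i] > limit[i] {
			return fmt.Errorf("tag %s is newer than the maximum supported %s", tag, maxSupported)
		}
		if got[i] < limit[i] {
			break
		}
	}
	return nil
}

func main() {
	// The maximum below is an arbitrary value chosen for the demo.
	const maxSupported = "v1.8.6"
	for _, tag := range []string{"v1.8.4", "1.8.4", "v1.9.0"} {
		if err := validateCoreDNSTag(tag, maxSupported); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", tag)
	}
}
```

In a real webhook the upper bound would presumably come from whatever version list the migration library exposes, so that bumping the dependency raises the limit automatically.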

vincepri avatar Feb 11 '22 17:02 vincepri

xref: https://github.com/kubernetes-sigs/cluster-api/issues/4463 (there's a bunch of information in this issue)

sbueringer avatar Feb 11 '22 17:02 sbueringer

  • [x] we should double-check if migration.ValidUpMigration actually works as expected. In our upgrade tests there seem to be cases where we migrate from v1.8.0 => v1.8.4 which should not be supported according to ValidUpMigration

sbueringer avatar Mar 07 '22 15:03 sbueringer

@sbueringer I'll take a look at this. In which case shouldn't we be allowed to go to v1.8.4?

killianmuldoon avatar Mar 09 '22 15:03 killianmuldoon

@killianmuldoon I think we kind of should, but it looks like the migration might have a problem migrating the CoreDNS config then. (based on ValidUpMigration it seems to migrate from one version to another, so we would need v1.8.0=>v1.8.1=>v1.8.2=>v1.8.3=>v1.8.4)

There's an error somewhere in those assumptions, but I don't know where, and we should find out :)

sbueringer avatar Mar 09 '22 16:03 sbueringer

/assign

killianmuldoon avatar Mar 09 '22 17:03 killianmuldoon

We realized that I misread ValidUpMigration

sbueringer avatar Mar 09 '22 17:03 sbueringer

@vincepri https://github.com/kubernetes-sigs/cluster-api/pull/6406 is a solution to the first item on your list:

  • [x] Automate the version bump of the coredns migration library (somehow)
  • [ ] Validate that users won't be able to set a version higher than the maximum supported within the library
  • [ ] Validate that the version has a prefix (v)

killianmuldoon avatar Apr 11 '22 15:04 killianmuldoon

/triage accepted

fabriziopandini avatar Sep 30 '22 19:09 fabriziopandini

/help
/unassign @killianmuldoon

fabriziopandini avatar Sep 30 '22 19:09 fabriziopandini

@fabriziopandini: This request has been marked as needing help from a contributor.

k8s-ci-robot avatar Sep 30 '22 19:09 k8s-ci-robot

(doing some cleanup on old issues without updates)
/close
We now have dependabot helping us keep up with this dependency, and I'm not aware of issues about recent version validation, so I'm closing this for now.

fabriziopandini avatar Mar 24 '23 16:03 fabriziopandini

@fabriziopandini: Closing this issue.

k8s-ci-robot avatar Mar 24 '23 16:03 k8s-ci-robot