cluster-api

Ignition v3.x support

Open bengentil opened this issue 2 years ago • 20 comments

What would you like to be added (User Story)?

As a user, I would like to bootstrap Flatcar and FCOS with Ignition v3.x, because Ignition v2.x doesn't work with FCOS and is only required for the Flatcar LTS release.

Detailed Description

The current implementation (Container Linux Config, aka clc) is used when the format is set to "ignition" in KubeadmConfigSpec. To support both v2.x and v3.x, a new field needs to be added to indicate which transpiler should be used:

  • clc for v2.x
  • butane for v3.x

Like in the current implementation, an additional butane config may be provided to extend/override the generated Ignition config.
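A minimal sketch of how such a selector could look, for illustration only. The type `IgnitionVersion`, its values, and the function `SelectTranspiler` are hypothetical names, not part of the existing KubeadmConfigSpec API. Treating an empty value as v2 to preserve the current behaviour is also an assumption.

```go
package main

import "fmt"

// IgnitionVersion is a hypothetical field type that tells the bootstrap
// provider which Ignition major version to generate.
type IgnitionVersion string

const (
	IgnitionV2 IgnitionVersion = "v2"
	IgnitionV3 IgnitionVersion = "v3"
)

// Transpiler names as used in the issue description.
const (
	TranspilerCLC    = "clc"
	TranspilerButane = "butane"
)

// SelectTranspiler maps the requested Ignition version to a transpiler.
// An empty value falls back to v2 (assumption: keep today's behaviour
// for existing objects that do not set the new field).
func SelectTranspiler(v IgnitionVersion) (string, error) {
	switch v {
	case "", IgnitionV2:
		return TranspilerCLC, nil
	case IgnitionV3:
		return TranspilerButane, nil
	default:
		return "", fmt.Errorf("unsupported Ignition version %q", v)
	}
}

func main() {
	for _, v := range []IgnitionVersion{"", IgnitionV2, IgnitionV3, "v4"} {
		t, err := SelectTranspiler(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("version %q uses transpiler %s\n", v, t)
	}
}
```

Rejecting unknown values explicitly, instead of silently falling back, would let a validation webhook surface typos early.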

Anything else you would like to add?

This issue follows this slack conversation: https://kubernetes.slack.com/archives/C8TSNPY4T/p1688643597808979

Label(s) to be applied

/kind feature /area bootstrap /area provider/bootstrap-kubeadm

bengentil avatar Aug 09 '23 17:08 bengentil

This issue is currently awaiting triage.

If CAPI contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Aug 09 '23 17:08 k8s-ci-robot

/priority important-soon

vincepri avatar Sep 01 '23 14:09 vincepri

I've been looking into this a little and concluded that resolving this should be relatively straightforward, at least on the providers' side.

Take AWS as an example. The only thing it does for Ignition is upload the existing Ignition data to S3 and then create a stub Ignition config which points to the S3 bucket.

There are minor differences between v2 and v3, but it's not hard to create a switch and support this by importing both the v2 and v3 types.
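To make the v2/v3 difference concrete, here is a rough sketch of such a switch. It does not import the real Ignition types and is not the CAPA implementation. It uses small local structs and relies on one spec difference: v2 references a remote config under `ignition.config.append`, while v3 uses `ignition.config.merge`. The function name `BuildStub`, the chosen spec versions (2.3.0 and 3.4.0), and the bucket URL are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// configRef points at a remote Ignition config.
type configRef struct {
	Source string `json:"source"`
}

// configRefs holds the remote references. Spec v2 uses "append",
// spec v3 uses "merge"; only one of them is set per stub.
type configRefs struct {
	Append []configRef `json:"append,omitempty"`
	Merge  []configRef `json:"merge,omitempty"`
}

type ignitionSection struct {
	Version string     `json:"version"`
	Config  configRefs `json:"config"`
}

type stubConfig struct {
	Ignition ignitionSection `json:"ignition"`
}

// BuildStub returns a stub Ignition config that pulls the real config
// from source. major selects the Ignition major version (2 or 3).
func BuildStub(major int, source string) ([]byte, error) {
	refs := []configRef{{Source: source}}
	switch major {
	case 2:
		return json.Marshal(stubConfig{Ignition: ignitionSection{
			Version: "2.3.0", // assumed v2 spec version
			Config:  configRefs{Append: refs},
		}})
	case 3:
		return json.Marshal(stubConfig{Ignition: ignitionSection{
			Version: "3.4.0", // assumed v3 spec version
			Config:  configRefs{Merge: refs},
		}})
	default:
		return nil, fmt.Errorf("unsupported Ignition major version %d", major)
	}
}

func main() {
	// Hypothetical bucket and object key.
	const source = "s3://example-bucket/node.ign"
	for _, major := range []int{2, 3} {
		stub, err := BuildStub(major, source)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(string(stub))
	}
}
```

A real provider would more likely import the upstream v2 and v3 type packages, as suggested above, so that validation comes with the types.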

For kubeadm this may be a little bit more involved.

Does anyone know about the flatcar ignition fork vs the coreos ignition? I was looking at both today and as far as I can tell they are API compatible, but the coreos fork seems to be more active at the moment?

JoelSpeed avatar Sep 04 '23 10:09 JoelSpeed

The PR addressing this issue is currently waiting on progress with https://github.com/kubernetes-sigs/cluster-api/issues/5294

see https://github.com/kubernetes-sigs/cluster-api/pull/9158#issuecomment-1676316931 for more details

bengentil avatar Sep 04 '23 14:09 bengentil

Does anyone know about the flatcar ignition fork vs the coreos ignition? I was looking at both today and as far as I can tell they are API compatible, but the coreos fork seems to be more active at the moment?

I think @pothos might know the answer.

johananl avatar Sep 06 '23 09:09 johananl

The forked repo is only used by Flatcar LTS. Flatcar Stable uses the CoreOS upstream repo (with some downstream patches to also support Ignition spec v2).

pothos avatar Sep 06 '23 12:09 pothos

Thanks @pothos. Based on that, I think we should move back to importing the types from the upstream coreos repo rather than from the forks. Thanks for the context.

JoelSpeed avatar Sep 06 '23 12:09 JoelSpeed

We should consider this ☝️ in the context of https://github.com/kubernetes-sigs/cluster-api/issues/5294.

johananl avatar Sep 06 '23 13:09 johananl

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jan 27 '24 18:01 k8s-triage-robot

/remove-lifecycle stale

johananl avatar Jan 29 '24 11:01 johananl

/lifecycle stale

k8s-triage-robot avatar Apr 28 '24 11:04 k8s-triage-robot

/remove-lifecycle stale

johananl avatar May 02 '24 06:05 johananl

Waiting for this feature.

FR-Solution avatar Jul 03 '24 16:07 FR-Solution

FYI, we currently don't have any maintainer for the ignition code in the project 😓 Also, when we introduced ignition there was a general commitment to pay down the tech debt introduced by this choice before making further improvements in this area, but no volunteers have shown up for this effort either (and it has been a few years now).

Not really sure about what will be the next steps here (even if we keep the issue around)

fabriziopandini avatar Jul 31 '24 13:07 fabriziopandini

Just a note: I think that since I last commented here, we added Ignition 3.x support to CAPA. We are not users of the kubeadm provider but may be able to help with implementation guidance if one of the existing users wants to work on it.

JoelSpeed avatar Jul 31 '24 15:07 JoelSpeed

but also for this effort there are no volunteers showing up (and it is a few years now)

I submitted a PR addressing this feature but was advised to stop working on it as it would add additional dependencies and a larger rewrite was ongoing in this issue: https://github.com/kubernetes-sigs/cluster-api/issues/5294

https://github.com/kubernetes-sigs/cluster-api/pull/9158#issuecomment-1676316931

bengentil avatar Aug 02 '24 14:08 bengentil

Hi, was there any further discussion about taking this forward in Slack or in the CAPI meeting? I'm interested in getting this feature added to CAPI.

Karthik-K-N avatar Aug 19 '24 04:08 Karthik-K-N

Hi, since the roadmap and timeline of https://github.com/kubernetes-sigs/cluster-api/issues/5294 are not clear, would the community consider revisiting merging https://github.com/kubernetes-sigs/cluster-api/pull/9158?

Being stuck with Ignition v2 for another year or two means:

  • not being able to use any modern OS with CABPK unless the OS provides in-place v2 to v3 conversion (like Flatcar does)
  • not being able to use kernel arguments, available as of spec 3.3.0 (or other v3 features)
  • using legacy Ignition v2, which was deprecated a long time ago
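As an illustration of the kernel arguments point, the sketch below builds a spec 3.3.0 config that uses the `kernelArguments` section with its `shouldExist` and `shouldNotExist` lists, which has no v2 equivalent. The helper name `BuildKernelArgsConfig` and the argument values are made up for the example, and the structs are local stand-ins rather than the upstream Ignition types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// kernelArguments mirrors the section added in Ignition spec 3.3.0.
type kernelArguments struct {
	ShouldExist    []string `json:"shouldExist,omitempty"`
	ShouldNotExist []string `json:"shouldNotExist,omitempty"`
}

type ignitionVersion struct {
	Version string `json:"version"`
}

type kernelArgsConfig struct {
	Ignition        ignitionVersion `json:"ignition"`
	KernelArguments kernelArguments `json:"kernelArguments"`
}

// BuildKernelArgsConfig is a hypothetical helper that returns a minimal
// spec 3.3.0 config requesting the given kernel argument changes.
func BuildKernelArgsConfig(shouldExist, shouldNotExist []string) ([]byte, error) {
	return json.Marshal(kernelArgsConfig{
		Ignition: ignitionVersion{Version: "3.3.0"},
		KernelArguments: kernelArguments{
			ShouldExist:    shouldExist,
			ShouldNotExist: shouldNotExist,
		},
	})
}

func main() {
	// Example values only: add one argument and remove another.
	cfg, err := BuildKernelArgsConfig(
		[]string{"systemd.unified_cgroup_hierarchy=1"},
		[]string{"quiet"},
	)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(cfg))
}
```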

FWIW, CABPK should switch to Ignition v3 by default and base any other redesign work on it.

defo89 avatar Sep 09 '24 13:09 defo89

FWIW, CABPK should switch to Ignition v3 by default and base any other redesign work on it.

In the past, a similar decision in this area did not lead to the expected results (see https://github.com/kubernetes-sigs/cluster-api/issues/9157#issuecomment-2260590126).

Considering this, and the current bandwidth of the folks actively maintaining the project, I'm personally -1 on continuing down the current path, simply because there are not enough people investing in this feature and in paying down the technical debt related to it.

fabriziopandini avatar Sep 10 '24 08:09 fabriziopandini

@fabriziopandini can we announce this during the CAPI meeting? We can probably gather some people interested in this topic and start a WG for maintaining ignition. I'm open to and interested in maintaining it, potentially along with other folks.

mcbenjemaa avatar Oct 21 '24 17:10 mcbenjemaa

@fabriziopandini Flatcar would be interested in picking up maintenance and also in investigating Ignition v3 support. However, a past requirement for introducing new features like v3 support was to rewrite the whole bootstrap provider for clearer separation (see https://github.com/kubernetes-sigs/cluster-api/issues/5294). That's too much for us, considering it touches quite a few non-ignition-related parts of the code base. If that's not a blocker anymore, we'd be happy to look into supporting the Ignition feature.

t-lo avatar Oct 24 '24 11:10 t-lo

If https://github.com/kubernetes-sigs/cluster-api/issues/5294 is no longer a blocker, I'd be happy to help resume the work in the PR https://github.com/kubernetes-sigs/cluster-api/pull/9158 or in a new PR.

bengentil avatar Oct 25 '24 17:10 bengentil

Unfortunately, we are in this uncomfortable spot because we did not invest at the beginning, and I really want to get out of this situation. But adding on top of what we have today is, IMO, not a viable path.

I think we should get back to the original need, "I want to reuse some code from CABPK & KCP", and find better ways to address it and, most importantly, quoting our manifesto, "without increasing operational and conceptual complexity for Cluster API’s users."

fabriziopandini avatar Oct 28 '24 09:10 fabriziopandini

Unfortunately, we are in this uncomfortable spot because we did not invest at the beginning, and I really want to get out of this situation. But adding on top of what we have today is, IMO, not a viable path.

I share your concerns, but we don't actually want to add on top. We just want to keep ignition support up to date. Deprecated bits (like v2 support) would be removed. This is much more maintenance work than it is adding a feature.

t-lo avatar Oct 28 '24 10:10 t-lo

When we introduced ignition there was a general commitment to pay down the tech debt introduced by this choice before making further improvements in this area, but no volunteers have shown up for this effort either (and it has been a few years now).

If no one is willing to pay down the tech debt, I'm -1 on keeping going with what we have now.

I understand this might seem like too much from the PoV of an occasional contributor, but from the PoV of core maintainers, the number of features not properly taken care of is getting too large, and WRT ignition there were prior discussions and decisions we should respect.

fabriziopandini avatar Oct 28 '24 16:10 fabriziopandini

/lifecycle stale

k8s-triage-robot avatar Jan 26 '25 16:01 k8s-triage-robot

/lifecycle rotten

k8s-triage-robot avatar Feb 25 '25 17:02 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot avatar Apr 24 '25 14:04 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot avatar Apr 24 '25 14:04 k8s-ci-robot