cluster-api-provider-vsphere

Conflict between NodeAntiAffinity and Failure domains

Open rikatz opened this issue 2 years ago • 3 comments

/kind bug

What steps did you take and what happened:

  • Enable the NodeAntiAffinity feature flag
  • Deploy one (or multiple) failure domains and deployment zones (VSphereFailureDomain / VSphereDeploymentZone)
  • Create a VSphereMachineTemplate that does not set the datacenter and resourcePool fields, since these should be filled in by the failure domain clone/override function
  • An error occurs: the cluster module for NodeAntiAffinity cannot be created because the datacenter is missing at that point; it would only be added later by the override function (see the sketch after the log output below):
E0908 20:57:35.913906       1 clustermodule_reconciler.go:132] "capv-controller-manager/vspherecluster-controller/cluster3958-0e908982/cluster3958-0e908982: failed to create cluster module for target object" err="please specify a datacenter" name="cluster3958-0e908982"
E0908 20:57:35.914354       1 controller.go:324] "Reconciler error" err="failed to create cluster modules for: cluster3958-0e908982 please specify a datacenter" controller="vspherecluster" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="VSphereCluster" VSphereCluster="cluster3958-0e908982/cluster3958-0e908982" namespace="cluster3958-0e908982" name="cluster3958-0e908982" reconcileID=b2491d58-964a-4a6c-8182-f6672da54c91
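
For illustration, a minimal sketch of the manifests involved, assuming a single deployment zone; all names, paths, and sizing values are made up for this example. The important part is that the VSphereMachineTemplate leaves datacenter and resourcePool empty and relies on the deployment zone / failure domain to supply them during clone:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereFailureDomain
metadata:
  name: zone-a
spec:
  region:
    name: dc-a
    type: Datacenter
    tagCategory: k8s-region
  zone:
    name: cluster-a
    type: ComputeCluster
    tagCategory: k8s-zone
  topology:
    datacenter: dc-a                      # supplied here, not on the template
    computeCluster: /dc-a/host/cluster-a
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereDeploymentZone
metadata:
  name: zone-a
spec:
  server: vcenter.example.com
  failureDomain: zone-a
  controlPlane: true
  placementConstraint:
    resourcePool: /dc-a/host/cluster-a/Resources/rp-a
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereMachineTemplate
metadata:
  name: cluster3958-control-plane
spec:
  template:
    spec:
      server: vcenter.example.com
      template: ubuntu-2204-kube-v1.27
      # datacenter and resourcePool intentionally omitted; they are expected
      # to come from the VSphereDeploymentZone / VSphereFailureDomain above,
      # but the cluster module reconciler needs the datacenter earlier.
      numCPUs: 2
      memoryMiB: 8192
      diskGiB: 25
      network:
        devices:
          - networkName: VM Network
            dhcp4: true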

What did you expect to happen: Be able to deploy a control plane across multiple failure domains with NodeAntiAffinity enabled.

Environment:

  • Cluster-api-provider-vsphere version: v1.8.1
  • Kubernetes version: (use kubectl version):
  • OS (e.g. from /etc/os-release):

I will take care of this one, just reporting it here in case someone else sees it as well :)

/assign

rikatz avatar Sep 11 '23 13:09 rikatz

Sounds logical. I just wonder if the problem is bigger than that.

If I understand correctly, we create one cluster module for the KCP today (with one UUID). But now we would create one cluster module per datacenter for the KCP, right?

(But I assume you'll hit that case anyway once the current error is fixed, if you have Machines in multiple failure domains / datacenters.)
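
For concreteness, a hypothetical layout where that would come up (abbreviated, illustrative manifests; the region/zone tag definitions are omitted): two failure domains whose topologies point at different datacenters, so control-plane Machines spread across them could not all be covered by a single cluster module created against one datacenter:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereFailureDomain
metadata:
  name: zone-a
spec:
  topology:
    datacenter: dc-a
    computeCluster: /dc-a/host/cluster-a
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereFailureDomain
metadata:
  name: zone-b
spec:
  topology:
    datacenter: dc-b                      # a different datacenter than zone-a
    computeCluster: /dc-b/host/cluster-b
# Each zone would also have a VSphereDeploymentZone with controlPlane: true,
# as in the example further up, so KCP spreads Machines across dc-a and dc-b.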

sbueringer avatar Sep 12 '23 12:09 sbueringer

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jan 28 '24 04:01 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Feb 27 '24 05:02 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot avatar Mar 28 '24 06:03 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this (quoting the triage robot's comment above):

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Mar 28 '24 06:03 k8s-ci-robot