
Remove dependency on control plane nodes

Open CmdrSharp opened this issue 1 year ago • 5 comments

Is this a BUG REPORT or FEATURE REQUEST?: Feature Request

/kind feature

Suggestion: Deploying the CSI driver currently requires the control plane nodes to be schedulable and to run the kubelet. This is not always the case: with k0s, for example, the control plane is fully isolated by default and does not run a kubelet at all.

The suggestion is to either redesign the deployment or add an option that removes the dependency on control plane nodes, scheduling the csi-controller onto worker nodes instead.

CmdrSharp avatar Dec 24 '23 12:12 CmdrSharp

@CmdrSharp can we schedule the CSI controller on the control plane nodes, but make sure we don't run the CSI node DaemonSet on them? With that approach, we can exclude control plane nodes when calculating the shared datastore for volume provisioning.

For security reasons it is not good to schedule the CSI controller on worker nodes: the controller requires vCenter credentials, and we want to make sure those are only used on control plane nodes.

Also, the network configuration may not allow connections to the vCenter Server from all worker nodes.
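The approach described above (controller on control plane, node DaemonSet kept off it) could be expressed with a node-affinity rule on the DaemonSet. A minimal sketch, assuming the standard `node-role.kubernetes.io/control-plane` label; verify against the actual upstream manifest:

```yaml
# Sketch: keep the CSI node DaemonSet off control-plane nodes.
# Only the affinity stanza is shown; it would be merged into the
# DaemonSet's pod template spec.
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: node-role.kubernetes.io/control-plane
                    operator: DoesNotExist
```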

cc: @xing-yang @SandeepPissay

divyenpatel avatar Dec 29 '23 21:12 divyenpatel

Hi! The issue would remain for every setup where the control plane does not run a kubelet at all. Nothing guarantees that control plane nodes can run user workloads, although it is of course common. From a security standpoint, while I understand the reasoning, I'd rather not have my control plane be a schedulable entity in the first place.

Perhaps the solution here is simply to document the kustomizations needed to manually override the node selectors and tolerations? Users who want to keep the controller separated from normal worker nodes can create dedicated worker nodes with custom taints. The documentation could underline the recommendation to keep the controller on nodes isolated from regular workloads.
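Such a kustomization could look roughly like the sketch below. The deployment name `vsphere-csi-controller`, the namespace `vmware-system-csi`, and the `dedicated=csi-controller` taint/label are assumptions for illustration; check them against the manifest you deploy:

```yaml
# kustomization.yaml (sketch, names assumed as noted above)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - vsphere-csi-driver.yaml   # the upstream manifest, vendored locally
patches:
  - target:
      kind: Deployment
      name: vsphere-csi-controller
      namespace: vmware-system-csi
    patch: |-
      # Move the controller onto dedicated, tainted worker nodes
      # instead of the control plane.
      - op: replace
        path: /spec/template/spec/nodeSelector
        value:
          dedicated: csi-controller
      - op: replace
        path: /spec/template/spec/tolerations
        value:
          - key: dedicated
            operator: Equal
            value: csi-controller
            effect: NoSchedule
```

Applying it with `kubectl apply -k .` would then schedule the controller only on nodes carrying the matching label and taint.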

I would've happily taken a stab at this, but the documentation is hosted elsewhere.

CmdrSharp avatar Dec 30 '23 03:12 CmdrSharp

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Mar 29 '24 03:03 k8s-triage-robot

/remove-lifecycle stale

CmdrSharp avatar Mar 30 '24 15:03 CmdrSharp