vsphere-csi-driver
Bug Report - vsphere-csi-driver disabled px storage cluster
Is this a BUG REPORT or FEATURE REQUEST?:
Uncomment only one, leave it on its own line:
/kind bug
What happened: I have an OpenShift 4.12 baremetal cluster with a mix of vSphere VMs and baremetal nodes. The baremetal nodes are configured as Portworx storage providers, so I do not want the vSphere CSI driver to interact with those specific worker nodes. I want to apply the vSphere CSI driver only to the infrastructure nodes in my cluster, meaning vSphere would provide storage to only 3 specific nodes. When I tried to install it following the instructions, the vSphere CSI driver knocked my Portworx cluster offline, which caused a bunch of problems.
What you expected to happen: I was expecting to be able to run multiple CSI drivers in my cluster, and to apply the vSphere storage only to the infrastructure nodes, which are running in VMware.
How to reproduce it (as minimally and precisely as possible): To reproduce this, apply the vsphere-csi-driver configs to the cluster.
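For reference, a minimal sketch of that apply step, assuming the upstream manifests were saved locally as vsphere-csi-driver.yaml and the default vmware-system-csi namespace was used; the file, secret, and namespace names follow the upstream install flow and are not taken from this cluster:

```sh
# Sketch of a minimal upstream-style install (names are assumptions, not from this cluster).
# 1. Create the driver namespace used by the upstream manifests.
oc create namespace vmware-system-csi
# 2. Create the vSphere config secret the driver expects (csi-vsphere.conf prepared beforehand).
oc create secret generic vsphere-config-secret \
  --from-file=csi-vsphere.conf -n vmware-system-csi
# 3. Apply the driver manifests saved locally from the upstream release.
oc apply -f vsphere-csi-driver.yaml
```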
Anything else we need to know?:
Environment:
- csi-vsphere version: latest
- vsphere-cloud-controller-manager version: latest
- Kubernetes version: 1.25/OpenShift 4.12
- vSphere version:
- OS (e.g. from /etc/os-release): RHCOS/OpenShift
- Kernel (e.g. uname -a):
- Install tools: oc client
- Others:
cc: @gnufied
So I assume this cluster was deployed with the baremetal platform type when deploying OCP? I ask because OpenShift 4.12 by default installs a vSphere CSI driver on all nodes in the cluster, and it can't be disabled or turned off.
Can you confirm what kind of platform integration you chose when installing OCP?
Assuming a baremetal install, it should be possible to install the vSphere driver and the Portworx driver separately (at least in theory).
> The vsphere.csi.driver knocked my portworx cluster offline which caused a bunch of problems.
Can you elaborate? Did you set node-selectors for both controller and daemonset appropriately?
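For illustration, a rough sketch of what that could look like, assuming the upstream object names (the vsphere-csi-controller Deployment and vsphere-csi-node DaemonSet in the vmware-system-csi namespace) and the standard OpenShift infra node label; names may differ on a real cluster:

```sh
# Sketch only: pin the controller Deployment and the node DaemonSet to infra nodes
# via a nodeSelector. Object and namespace names assume upstream defaults.
oc -n vmware-system-csi patch deployment vsphere-csi-controller \
  -p '{"spec":{"template":{"spec":{"nodeSelector":{"node-role.kubernetes.io/infra":""}}}}}'
oc -n vmware-system-csi patch daemonset vsphere-csi-node \
  -p '{"spec":{"template":{"spec":{"nodeSelector":{"node-role.kubernetes.io/infra":""}}}}}'
```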
Thanks for the response.
Yes, this is a baremetal cluster type, so there is no specific platform integration.
I did use node selectors on the DaemonSets, but maybe not on the controller, and I'm not sure how to do that.
When I say it knocked my Portworx cluster offline, what I mean is that it took priority over my Portworx CSI driver and set itself as the default. Somehow that disconnected the Array Blade communication from the OpenShift cluster, which forced us to redeploy and request new licensing for the cluster.
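For anyone hitting the same symptom, a hedged sketch of how the default StorageClass can be checked and moved back to Portworx; the class names thin-csi and px-storageclass below are placeholders, not taken from this cluster:

```sh
# See which StorageClass currently shows "(default)".
oc get storageclass
# Drop the default annotation from the vSphere class (placeholder name)...
oc patch storageclass thin-csi \
  -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'
# ...and set it on the Portworx class instead (placeholder name).
oc patch storageclass px-storageclass \
  -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
```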
It is hard to say much without looking at logs and cluster configuration. I would recommend opening a ticket against OpenShift and providing all the details, such as a must-gather, along with the oc adm inspect ( https://docs.openshift.com/container-platform/4.13/cli_reference/openshift_cli/administrator-cli-commands.html#oc-adm-inspect ) output of both namespaces in which the vSphere and Portworx drivers are deployed.
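For example, something along these lines, where the namespace names are assumptions and should be replaced with the namespaces the two drivers actually run in:

```sh
# Collect a general must-gather plus focused inspects of both driver namespaces
# (namespace names are assumptions).
oc adm must-gather --dest-dir=./must-gather
oc adm inspect ns/vmware-system-csi --dest-dir=./inspect-vsphere
oc adm inspect ns/portworx --dest-dir=./inspect-portworx
```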
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.