vsphere-csi-driver
CNS volumes disappear and all in cluster operations fail
Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug
What happened: In two separate Kubernetes clusters we have observed failures when pods are moved and, as a result, their PVs have to be attached to the new node. The CNS volume object seems to simply disappear, although the backing FCD is still intact on the vSAN. As far as I can see, nothing has triggered a deletion...
Events from the kubernetes side:
kubectl get event -n workload
LAST SEEN TYPE REASON OBJECT MESSAGE
26s Normal Scheduled pod/redis-754fbf4bd-26lbf Successfully assigned workload/redis-754fbf4bd-26lbf to worker3
11s Warning FailedAttachVolume pod/redis-754fbf4bd-26lbf AttachVolume.Attach failed for volume "pvc-202662bf-3ce7-40a0-96af-d22d58198dce" : rpc error: code = Internal desc = failed to attach disk: "eb05ab8b-dcd0-4217-9c34-3ec8bda666a9" with node: "worker3" err ServerFaultCode: Received SOAP response fault from [<cs p:00007fa4dc0a6290, TCP:localhost:443>]: retrieveVStorageObject
Logs from csi-controller:
2021-09-17 12:27:36
I0917 09:27:36.946521 1 reflector.go:381] k8s.io/client-go/informers/factory.go:134: forcing resync
2021-09-17 12:27:36
I0917 09:27:36.340379 1 reflector.go:381] k8s.io/client-go/informers/factory.go:134: forcing resync
2021-09-17 12:27:35
I0917 09:27:35.969819 1 csi_handler.go:226] Error processing "csi-697720ade9eeaa3b9851f3276fb8a3270cda2ff287e7a44690e09c7a9b3bcfba": failed to detach: rpc error: code = Internal desc = volumeID "eb05ab8b-dcd0-4217-9c34-3ec8bda666a9" not found in QueryVolume
2021-09-17 12:27:35
I0917 09:27:35.969770 1 csi_handler.go:612] Saved detach error to "csi-697720ade9eeaa3b9851f3276fb8a3270cda2ff287e7a44690e09c7a9b3bcfba"
2021-09-17 12:27:35
I0917 09:27:35.967871 1 controller.go:158] Ignoring VolumeAttachment "csi-697720ade9eeaa3b9851f3276fb8a3270cda2ff287e7a44690e09c7a9b3bcfba" change
2021-09-17 12:27:35
I0917 09:27:35.952349 1 csi_handler.go:601] Saving detach error to "csi-697720ade9eeaa3b9851f3276fb8a3270cda2ff287e7a44690e09c7a9b3bcfba"
2021-09-17 12:27:35
{"level":"error","time":"2021-09-17T09:27:35.951799411Z","caller":"vanilla/controller.go:883","msg":"volumeID \"eb05ab8b-dcd0-4217-9c34-3ec8bda666a9\" not found in QueryVolume","TraceId":"e3830e8e-5954-433b-a777-e5623668c7b3","stacktrace":"sigs.k8s.io/vsphere-csi-driver/pkg/csi/service/vanilla.(*controller).ControllerUnpublishVolume.func1\n\t/build/pkg/csi/service/vanilla/controller.go:883\nsigs.k8s.io/vsphere-csi-driver/pkg/csi/service/vanilla.(*controller).ControllerUnpublishVolume\n\t/build/pkg/csi/service/vanilla/controller.go:937\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_ControllerUnpublishVolume_Handler.func1\n\t/go/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/csi.pb.go:5200\ngithub.com/rexray/gocsi/middleware/serialvolume.(*interceptor).controllerUnpublishVolume\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/serialvolume/serial_volume_locker.go:141\ngithub.com/rexray/gocsi/middleware/serialvolume.(*interceptor).handle\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/serialvolume/serial_volume_locker.go:88\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer.func1\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/specvalidator/spec_validator.go:178\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handle\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/specvalidator/spec_validator.go:218\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/specvalidator/spec_validator.go:177\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email 
protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi.(*StoragePlugin).injectContext\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware.go:231\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:106\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_ControllerUnpublishVolume_Handler\n\t/go/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/csi.pb.go:5202\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1024\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1313\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.1\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:722"}
2021-09-17 12:27:35
{"level":"info","time":"2021-09-17T09:27:35.904885335Z","caller":"vanilla/controller.go:857","msg":"ControllerUnpublishVolume: called with args {VolumeId:eb05ab8b-dcd0-4217-9c34-3ec8bda666a9 NodeId:worker3 Secrets:map[] XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}","TraceId":"e3830e8e-5954-433b-a777-e5623668c7b3"}
2021-09-17 12:27:35
I0917 09:27:35.903737 1 csi_handler.go:715] Found NodeID worker3 in CSINode worker3
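The detach error that the external-attacher saves (csi_handler.go:612 above) is recorded on the VolumeAttachment object, which can be inspected directly. This is a generic way to see the same error without grepping controller logs; the object name below is taken verbatim from the log lines above:

```shell
# List all VolumeAttachments and their attached state
kubectl get volumeattachment
# Show the saved attach/detach error for the VolumeAttachment from the logs
kubectl describe volumeattachment csi-697720ade9eeaa3b9851f3276fb8a3270cda2ff287e7a44690e09c7a9b3bcfba
```

The `detachError` field in the status should show the same "not found in QueryVolume" message.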
And some entries from vsanvcmgmtd:
2021-09-15T22:18:29.157Z verbose vsanvcmgmtd[10186] [vSAN@6876 sub=PyBackedMO opId=sps-Main-13723-334-8a1a] Enter vasa.NotificationManager.getAlarm, Pending: 1 (5269718e-b4b7-0511-b82b-b8f707de46d5)
2021-09-15T22:18:29.157Z verbose vsanvcmgmtd[10186] [vSAN@6876 sub=PyBackedMO opId=sps-Main-13723-334-8a1a] Exit vasa.NotificationManager.getAlarm (0 ms)
2021-09-15T22:18:38.683Z info vsanvcmgmtd[07066] [vSAN@6876 sub=AccessChecker] User CLISTER.LOCAL\[email protected] was authenticated with soap session id. 52db629f-f84a-15b2-6401-b41b25af9ec7 (52a55bd9-9ac9-d4bd-d9f0-d6145bb4f7a5)
2021-09-15T22:18:38.704Z verbose vsanvcmgmtd[07065] [vSAN@6876 sub=PyBackedMO opId=07dc8a1b] Enter vim.cns.VolumeManager.queryAll, Pending: 1 (52db629f-f84a-15b2-6401-b41b25af9ec7)
2021-09-15T22:18:38.706Z verbose vsanvcmgmtd[07065] [vSAN@6876 sub=PyBackedMO opId=07dc8a1b] Exit vim.cns.VolumeManager.queryAll (1 ms)
2021-09-15T22:18:38.731Z verbose vsanvcmgmtd[07104] [vSAN@6876 sub=PyBackedMO opId=07dc8a1c] Enter vim.cns.VolumeManager.query, Pending: 1 (52db629f-f84a-15b2-6401-b41b25af9ec7)
2021-09-15T22:18:38.885Z verbose vsanvcmgmtd[07104] [vSAN@6876 sub=PyBackedMO opId=07dc8a1c] Exit vim.cns.VolumeManager.query (154 ms)
2021-09-15T22:18:38.923Z verbose vsanvcmgmtd[10186] [vSAN@6876 sub=PyBackedMO opId=07dc8a1d] Enter vim.cns.VolumeManager.create, Pending: 1 (52db629f-f84a-15b2-6401-b41b25af9ec7)
2021-09-15T22:18:38.928Z info vsanvcmgmtd[10186] [vSAN@6876 sub=VolumeManager opId=07dc8a1d] CNS: create volume task created: task-314869
2021-09-15T22:18:38.928Z info vsanvcmgmtd[10186] [vSAN@6876 sub=VolumeManager opId=07dc8a1d] CNS: Creating volume task scheduled.
2021-09-15T22:18:38.928Z verbose vsanvcmgmtd[10186] [vSAN@6876 sub=PyBackedMO opId=07dc8a1d] Exit vim.cns.VolumeManager.create (4 ms)
2021-09-15T22:18:38.933Z info vsanvcmgmtd[25351] [vSAN@6876 sub=VolumeManager opId=07dc8a1d] CNS: Creating volume task started
2021-09-15T22:18:39.029Z error vsanvcmgmtd[25351] [vSAN@6876 sub=VolumeManager opId=07dc8a1d] CNS: backingDiskId not found: eb05ab8b-dcd0-4217-9c34-3ec8bda666a9, N3Vim5Fault8NotFound9ExceptionE(Fault cause: vim.fault.NotFound
--> )
--> [context]zKq7AVECAAAAAFnREgEVdnNhbnZjbWdtdGQAAFy7KmxpYnZtYWNvcmUuc28AAAw5GwB+uxgBNG38bGlidmltLXR5cGVzLnNvAIG7mg8BgVrlDwECtWEObGlidm1vbWkuc28AAspdEQJXXxEDHuoCbGliUHlDcHBWbW9taS5zbwACHJISAvGKEgTozgNsaWJ2c2xtLXR5cGVzLnNvAAXF3AdfY25zLnNvAAVN1gUFixQGABYkIwCSJiMA5RIrBtRzAGxpYnB0aHJlYWQuc28uMAAHzY4ObGliYy5zby42AA==[/context]
2021-09-15T22:18:39.031Z info vsanvcmgmtd[25351] [vSAN@6876 sub=VolumeManager opId=07dc8a1d] Create volume completed: task-314869
2021-09-15T22:18:39.054Z verbose vsanvcmgmtd[12908] [vSAN@6876 sub=PyBackedMO opId=07dc8a1e] Enter vim.cns.VolumeManager.create, Pending: 1 (52db629f-f84a-15b2-6401-b41b25af9ec7)
2021-09-15T22:18:39.057Z info vsanvcmgmtd[12908] [vSAN@6876 sub=VolumeManager opId=07dc8a1e] CNS: create volume task created: task-314870
2021-09-15T22:18:39.057Z info vsanvcmgmtd[12908] [vSAN@6876 sub=VolumeManager opId=07dc8a1e] CNS: Creating volume task scheduled.
2021-09-15T22:18:39.057Z verbose vsanvcmgmtd[12908] [vSAN@6876 sub=PyBackedMO opId=07dc8a1e] Exit vim.cns.VolumeManager.create (3 ms)
2021-09-15T22:18:39.062Z info vsanvcmgmtd[43408] [vSAN@6876 sub=VolumeManager opId=07dc8a1e] CNS: Creating volume task started
2021-09-15T22:18:39.158Z verbose vsanvcmgmtd[07049] [vSAN@6876 sub=PyBackedMO opId=sps-Main-13723-334-8a1f] Enter vasa.NotificationManager.getAlarm, Pending: 1 (5269718e-b4b7-0511-b82b-b8f707de46d5)
2021-09-15T22:18:39.158Z verbose vsanvcmgmtd[07049] [vSAN@6876 sub=PyBackedMO opId=sps-Main-13723-334-8a1f]
What you expected to happen:
CNS volume objects should not just disappear?
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
Environment:
- csi-vsphere version:
- vsphere-cloud-controller-manager version: v1.20.0
- Kubernetes version: 1.20.10
- vSphere version: 6.7u3 (6.7.0.48000)
- OS (e.g. from /etc/os-release): ubuntu 20.04
- Kernel (e.g. uname -a): 5.4.0-65-generic #73-Ubuntu SMP Mon Jan 18 17:25:17 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
- Install tools:
- Others:
I'm observing a very similar behaviour here: a PVC has just disappeared from vSphere and can't be attached now.
Hello, me too; I see the same behavior (CSI version 2.2.2).
When I list volumes using govc volume.ls, the volume is not present. But with the command govc disk.ls, it is.
Sometimes the volume shows up and then disappears again, which is very strange.
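That mismatch between the two listings can be checked systematically. A sketch, assuming govc is already configured (GOVC_URL, GOVC_USERNAME, GOVC_PASSWORD) and using "vsanDatastore" as a placeholder datastore name:

```shell
# Collect CNS volume IDs and first-class disk (FCD) IDs, first column of each listing
govc volume.ls | awk '{print $1}' | sort > cns-volumes.txt
govc disk.ls -ds vsanDatastore | awk '{print $1}' | sort > fcds.txt

# FCDs that have no corresponding CNS volume object -- the symptom described here
comm -13 cns-volumes.txt fcds.txt
```

Any ID printed by the final command is a disk that still exists on the datastore but has lost its CNS volume object.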
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
I also observe a similar issue. Some details from the logs:
- the volume is attached to node1 successfully
- the volume is detached from node1 successfully (strangely, detach is called twice consecutively, and the second time it logs that the volume is not found)
- the volume then fails to attach to node2 with a not-found error
{"level":"info","time":"2022-04-10T17:35:41.790422844Z","caller":"vanilla/controller.go:951","msg":"ControllerPublishVolume: called with args {VolumeId:e5dc0f1b-399b-4839-955e-18a174e4e1ea NodeId:08050d6e39f9-qual-322-0afbae8d VolumeCapability:mount:<fs_type:\"ext4\" > access_mode:<mode:SINGLE_NODE_WRITER > Readonly:false Secrets:map[] VolumeContext:map[storage.kubernetes.io/csiProvisionerIdentity:1649603959450-8081-csi.vsphere.vmware.com type:vSphere CNS Block Volume] XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}","TraceId":"b7849fe8-655d-41af-83f8-d5704cc297a0"}
{"level":"info","time":"2022-04-10T17:35:42.211793124Z","caller":"volume/manager.go:632","msg":"AttachVolume: volumeID: \"e5dc0f1b-399b-4839-955e-18a174e4e1ea\", vm: \"VirtualMachine:vm-1013713 [VirtualCenterHost: atl-qual-vc02.anthos, UUID: 4204f473-2987-d995-719b-eb65cfc7b00b, Datacenter: Datacenter [Datacenter: Datacenter:datacenter-3, VirtualCenterHost: atl-qual-vc02.anthos]]\", opId: \"0cfc5ab7\"","TraceId":"b7849fe8-655d-41af-83f8-d5704cc297a0"}
{"level":"info","time":"2022-04-10T17:35:42.211860571Z","caller":"volume/manager.go:667","msg":"AttachVolume: Volume attached successfully. volumeID: \"e5dc0f1b-399b-4839-955e-18a174e4e1ea\", opId: \"0cfc5ab7\", vm: \"VirtualMachine:vm-1013713 [VirtualCenterHost: atl-qual-vc02.anthos, UUID: 4204f473-2987-d995-719b-eb65cfc7b00b, Datacenter: Datacenter [Datacenter: Datacenter:datacenter-3, VirtualCenterHost: atl-qual-vc02.anthos]]\", diskUUID: \"6000C296-2e28-3df5-d780-af6831724cc6\"","TraceId":"b7849fe8-655d-41af-83f8-d5704cc297a0"}
{"level":"info","time":"2022-04-10T17:35:42.211891383Z","caller":"vanilla/controller.go:1045","msg":"ControllerPublishVolume successful with publish context: map[diskUUID:6000c2962e283df5d780af6831724cc6 type:vSphere CNS Block Volume]","TraceId":"b7849fe8-655d-41af-83f8-d5704cc297a0"}
{"level":"info","time":"2022-04-10T17:36:31.667566148Z","caller":"vanilla/controller.go:1075","msg":"ControllerUnpublishVolume: called with args {VolumeId:e5dc0f1b-399b-4839-955e-18a174e4e1ea NodeId:08050d6e39f9-qual-322-0afbae8d Secrets:map[] XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}","TraceId":"b0832a4a-28da-4c22-a869-8cacde7c7592"}
{"level":"info","time":"2022-04-10T17:36:31.928688013Z","caller":"volume/manager.go:753","msg":"DetachVolume: volumeID: \"e5dc0f1b-399b-4839-955e-18a174e4e1ea\", vm: \"VirtualMachine:vm-1013713 [VirtualCenterHost: atl-qual-vc02.anthos, UUID: 4204f473-2987-d995-719b-eb65cfc7b00b, Datacenter: Datacenter [Datacenter: Datacenter:datacenter-3, VirtualCenterHost: atl-qual-vc02.anthos]]\", opId: \"0cfc5ad6\"","TraceId":"b0832a4a-28da-4c22-a869-8cacde7c7592"}
{"level":"info","time":"2022-04-10T17:36:31.928813846Z","caller":"volume/manager.go:788","msg":"DetachVolume: Volume detached successfully. volumeID: \"e5dc0f1b-399b-4839-955e-18a174e4e1ea\", vm: \"0cfc5ad6\", opId: \"VirtualMachine:vm-1013713 [VirtualCenterHost: atl-qual-vc02.anthos, UUID: 4204f473-2987-d995-719b-eb65cfc7b00b, Datacenter: Datacenter [Datacenter: Datacenter:datacenter-3, VirtualCenterHost: atl-qual-vc02.anthos]]\"","TraceId":"b0832a4a-28da-4c22-a869-8cacde7c7592"}
{"level":"info","time":"2022-04-10T17:36:31.92890707Z","caller":"vanilla/controller.go:1159","msg":"ControllerUnpublishVolume successful for volume ID: e5dc0f1b-399b-4839-955e-18a174e4e1ea","TraceId":"b0832a4a-28da-4c22-a869-8cacde7c7592"}
{"level":"info","time":"2022-04-10T17:36:31.938332455Z","caller":"vanilla/controller.go:1075","msg":"ControllerUnpublishVolume: called with args {VolumeId:e5dc0f1b-399b-4839-955e-18a174e4e1ea NodeId:08050d6e39f9-qual-322-0afbae8d Secrets:map[] XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}","TraceId":"7e6019ad-5b7e-4106-981f-a68a87501d93"}
{"level":"info","time":"2022-04-10T17:36:32.091578876Z","caller":"volume/manager.go:753","msg":"DetachVolume: volumeID: \"e5dc0f1b-399b-4839-955e-18a174e4e1ea\", vm: \"VirtualMachine:vm-1013713 [VirtualCenterHost: atl-qual-vc02.anthos, UUID: 4204f473-2987-d995-719b-eb65cfc7b00b, Datacenter: Datacenter [Datacenter: Datacenter:datacenter-3, VirtualCenterHost: atl-qual-vc02.anthos]]\", opId: \"0cfc5ada\"","TraceId":"7e6019ad-5b7e-4106-981f-a68a87501d93"}
{"level":"info","time":"2022-04-10T17:36:32.091902929Z","caller":"volume/util.go:364","msg":"Extract vimfault type: +*types.NotFound vimFault: +&{{{<nil> []}}} Fault: &{DynamicData:{} Fault:0xc000d7b980 LocalizedMessage:The object or item referred to could not be found.} from resp: +&{{} {{} } 0xc000d7b720}","TraceId":"7e6019ad-5b7e-4106-981f-a68a87501d93"}
{"level":"info","time":"2022-04-10T17:36:32.094496975Z","caller":"volume/manager.go:780","msg":"DetachVolume: volumeID: \"e5dc0f1b-399b-4839-955e-18a174e4e1ea\" not found on vm: VirtualMachine:vm-1013713 [VirtualCenterHost: atl-qual-vc02.anthos, UUID: 4204f473-2987-d995-719b-eb65cfc7b00b, Datacenter: Datacenter [Datacenter: Datacenter:datacenter-3, VirtualCenterHost: atl-qual-vc02.anthos]]. Assuming it is already detached","TraceId":"7e6019ad-5b7e-4106-981f-a68a87501d93"}
{"level":"info","time":"2022-04-10T17:36:32.094536352Z","caller":"vanilla/controller.go:1159","msg":"ControllerUnpublishVolume successful for volume ID: e5dc0f1b-399b-4839-955e-18a174e4e1ea","TraceId":"7e6019ad-5b7e-4106-981f-a68a87501d93"}
{"level":"info","time":"2022-04-10T17:40:09.190591921Z","caller":"node/manager.go:106","msg":"Successfully registered node: \"08050d6e39f9-qual-322-0afbac17\" with nodeUUID \"4204f258-985f-b7f0-0782-e9a78fe37425\"","TraceId":"e9b8c1df-92a4-4c20-a570-e02e52a6e6ff"}
{"level":"info","time":"2022-04-10T17:40:09.190652275Z","caller":"vsphere/virtualmachine.go:147","msg":"Initiating asynchronous datacenter listing with uuid 4204f258-985f-b7f0-0782-e9a78fe37425","TraceId":"e9b8c1df-92a4-4c20-a570-e02e52a6e6ff"}
{"level":"info","time":"2022-04-10T17:40:09.218975792Z","caller":"vsphere/datacenter.go:149","msg":"Publishing datacenter Datacenter [Datacenter: Datacenter:datacenter-3 @ /atl-qual-vc02, VirtualCenterHost: atl-qual-vc02.anthos]","TraceId":"e9b8c1df-92a4-4c20-a570-e02e52a6e6ff"}
{"level":"info","time":"2022-04-10T17:40:09.219055143Z","caller":"vsphere/virtualmachine.go:184","msg":"AsyncGetAllDatacenters with uuid 4204f258-985f-b7f0-0782-e9a78fe37425 sent a dc Datacenter [Datacenter: Datacenter:datacenter-3 @ /atl-qual-vc02, VirtualCenterHost: atl-qual-vc02.anthos]","TraceId":"e9b8c1df-92a4-4c20-a570-e02e52a6e6ff"}
{"level":"info","time":"2022-04-10T17:40:09.224598843Z","caller":"vsphere/virtualmachine.go:198","msg":"Found VM VirtualMachine:vm-1013755 [VirtualCenterHost: atl-qual-vc02.anthos, UUID: 4204f258-985f-b7f0-0782-e9a78fe37425, Datacenter: Datacenter [Datacenter: Datacenter:datacenter-3 @ /atl-qual-vc02, VirtualCenterHost: atl-qual-vc02.anthos]] given uuid 4204f258-985f-b7f0-0782-e9a78fe37425 on DC Datacenter [Datacenter: Datacenter:datacenter-3 @ /atl-qual-vc02, VirtualCenterHost: atl-qual-vc02.anthos]","TraceId":"e9b8c1df-92a4-4c20-a570-e02e52a6e6ff"}
{"level":"info","time":"2022-04-10T17:40:09.224641496Z","caller":"vsphere/virtualmachine.go:209","msg":"Returning VM VirtualMachine:vm-1013755 [VirtualCenterHost: atl-qual-vc02.anthos, UUID: 4204f258-985f-b7f0-0782-e9a78fe37425, Datacenter: Datacenter [Datacenter: Datacenter:datacenter-3 @ /atl-qual-vc02, VirtualCenterHost: atl-qual-vc02.anthos]] for UUID 4204f258-985f-b7f0-0782-e9a78fe37425","TraceId":"e9b8c1df-92a4-4c20-a570-e02e52a6e6ff"}
{"level":"info","time":"2022-04-10T17:40:09.224664302Z","caller":"node/manager.go:127","msg":"Successfully discovered node with nodeUUID 4204f258-985f-b7f0-0782-e9a78fe37425 in vm VirtualMachine:vm-1013755 [VirtualCenterHost: atl-qual-vc02.anthos, UUID: 4204f258-985f-b7f0-0782-e9a78fe37425, Datacenter: Datacenter [Datacenter: Datacenter:datacenter-3 @ /atl-qual-vc02, VirtualCenterHost: atl-qual-vc02.anthos]]","TraceId":"e9b8c1df-92a4-4c20-a570-e02e52a6e6ff"}
{"level":"info","time":"2022-04-10T17:40:09.224677411Z","caller":"node/manager.go:112","msg":"Successfully discovered node: \"08050d6e39f9-qual-322-0afbac17\" with nodeUUID \"4204f258-985f-b7f0-0782-e9a78fe37425\"","TraceId":"e9b8c1df-92a4-4c20-a570-e02e52a6e6ff"}
{"level":"info","time":"2022-04-10T17:40:35.835323129Z","caller":"vanilla/controller.go:951","msg":"ControllerPublishVolume: called with args {VolumeId:e5dc0f1b-399b-4839-955e-18a174e4e1ea NodeId:08050d6e39f9-qual-322-0afbac17 VolumeCapability:mount:<fs_type:\"ext4\" > access_mode:<mode:SINGLE_NODE_WRITER > Readonly:false Secrets:map[] VolumeContext:map[storage.kubernetes.io/csiProvisionerIdentity:1649603959450-8081-csi.vsphere.vmware.com type:vSphere CNS Block Volume] XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}","TraceId":"4040d825-2600-4c49-b02d-e89d42ccc2fe"}
{"level":"error","time":"2022-04-10T17:40:35.859169488Z","caller":"volume/manager.go:616","msg":"CNS AttachVolume failed from vCenter \"atl-qual-vc02.anthos\" with err: ServerFaultCode: CNS: Failed to retrieve datastore for vol e5dc0f1b-399b-4839-955e-18a174e4e1ea. (vim.fault.NotFound) {\n faultCause = (vmodl.MethodFault) null, \n faultMessage = <unset>\n msg = \"The vStorageObject (vim.vslm.ID) {\n dynamicType = null,\n dynamicProperty = null,\n id = e5dc0f1b-399b-4839-955e-18a174e4e1ea\n} was not found\"\n}","TraceId":"4040d825-2600-4c49-b02d-e89d42ccc2fe","stacktrace":"sigs.k8s.io/vsphere-csi-driver/v2/pkg/common/cns-lib/volume.(*defaultManager).AttachVolume.func1\n\t/build/pkg/common/cns-lib/volume/manager.go:616\nsigs.k8s.io/vsphere-csi-driver/v2/pkg/common/cns-lib/volume.(*defaultManager).AttachVolume\n\t/build/pkg/common/cns-lib/volume/manager.go:672\nsigs.k8s.io/vsphere-csi-driver/v2/pkg/csi/service/common.AttachVolumeUtil\n\t/build/pkg/csi/service/common/vsphereutil.go:548\nsigs.k8s.io/vsphere-csi-driver/v2/pkg/csi/service/vanilla.(*controller).ControllerPublishVolume.func1\n\t/build/pkg/csi/service/vanilla/controller.go:1037\nsigs.k8s.io/vsphere-csi-driver/v2/pkg/csi/service/vanilla.(*controller).ControllerPublishVolume\n\t/build/pkg/csi/service/vanilla/controller.go:1050\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_ControllerPublishVolume_Handler.func1\n\t/go/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/csi.pb.go:5632\ngithub.com/rexray/gocsi/middleware/serialvolume.(*interceptor).controllerPublishVolume\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/serialvolume/serial_volume_locker.go:120\ngithub.com/rexray/gocsi/middleware/serialvolume.(*interceptor).handle\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/serialvolume/serial_volume_locker.go:86\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email 
protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer.func1\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/specvalidator/spec_validator.go:178\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handle\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/specvalidator/spec_validator.go:218\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/specvalidator/spec_validator.go:177\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi.(*StoragePlugin).injectContext\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware.go:231\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:106\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_ControllerPublishVolume_Handler\n\t/go/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/csi.pb.go:5634\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1024\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1313\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.1\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:722"}
{"level":"info","time":"2022-04-10T17:40:35.861356096Z","caller":"volume/util.go:343","msg":"Extract vimfault type: +types.NotFound. SoapFault Info: +&{{http://schemas.xmlsoap.org/soap/envelope/ Fault} ServerFaultCode CNS: Failed to retrieve datastore for vol e5dc0f1b-399b-4839-955e-18a174e4e1ea. (vim.fault.NotFound) {\n faultCause = (vmodl.MethodFault) null, \n faultMessage = <unset>\n msg = \"The vStorageObject (vim.vslm.ID) {\n dynamicType = null,\n dynamicProperty = null,\n id = e5dc0f1b-399b-4839-955e-18a174e4e1ea\n} was not found\"\n} {{{{<nil> []}}}}} from err +ServerFaultCode: CNS: Failed to retrieve datastore for vol e5dc0f1b-399b-4839-955e-18a174e4e1ea. (vim.fault.NotFound) {\n faultCause = (vmodl.MethodFault) null, \n faultMessage = <unset>\n msg = \"The vStorageObject (vim.vslm.ID) {\n dynamicType = null,\n dynamicProperty = null,\n id = e5dc0f1b-399b-4839-955e-18a174e4e1ea\n} was not found\"\n}","TraceId":"4040d825-2600-4c49-b02d-e89d42ccc2fe"}
cc @xing-yang @divyenpatel
/remove-lifecycle stale
@jingxu97 What version of CSI driver are you using? What is vSphere version? Did it retry when this happened? Also what has triggered this detach from node 1 and attach to node 2?
vSphere version: 7.0u3, CSI driver: 2.4.0
The test (anthos storage qualification test) is designed to move the pod to a different node and then access the same volume.
From our testing, 6.7u3 looks OK, but 7.0 fails very often.
Another issue related to this test: since the attach failed with a "not found" error, the attach/detach controller marks the volume as uncertain and tries to detach it when the pod is deleted; however, the detach also fails with a "not found" error:
{"level":"info","time":"2022-04-10T17:44:49.039474284Z","caller":"vanilla/controller.go:951","msg":"ControllerPublishVolume: called with args {VolumeId:e5dc0f1b-399b-4839-955e-18a174e4e1ea NodeId:08050d6e39f9-qual-322-0afbac17 VolumeCapability:mount:<fs_type:\"ext4\" > access_mode:<mode:SINGLE_NODE_WRITER > Readonly:false Secrets:map[] VolumeContext:map[storage.kubernetes.io/csiProvisionerIdentity:1649603959450-8081-csi.vsphere.vmware.com type:vSphere CNS Block Volume] XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}","TraceId":"6d250629-8008-41b7-baed-2deb2a6baca9"}
{"level":"error","time":"2022-04-10T17:44:49.072198256Z","caller":"volume/manager.go:616","msg":"CNS AttachVolume failed from vCenter \"atl-qual-vc02.anthos\" with err: ServerFaultCode: CNS: Failed to retrieve datastore for vol e5dc0f1b-399b-4839-955e-18a174e4e1ea. (vim.fault.NotFound) {\n faultCause = (vmodl.MethodFault) null, \n faultMessage = <unset>\n msg = \"The vStorageObject (vim.vslm.ID) {\n dynamicType = null,\n dynamicProperty = null,\n id = e5dc0f1b-399b-4839-955e-18a174e4e1ea\n} was not found\"\n}","TraceId":"6d250629-8008-41b7-baed-2deb2a6baca9","stacktrace":"sigs.k8s.io/vsphere-csi-driver/v2/pkg/common/cns-lib/volume.(*defaultManager).AttachVolume.func1\n\t/build/pkg/common/cns-lib/volume/manager.go:616\nsigs.k8s.io/vsphere-csi-driver/v2/pkg/common/cns-lib/volume.(*defaultManager).AttachVolume\n\t/build/pkg/common/cns-lib/volume/manager.go:672\nsigs.k8s.io/vsphere-csi-driver/v2/pkg/csi/service/common.AttachVolumeUtil\n\t/build/pkg/csi/service/common/vsphereutil.go:548\nsigs.k8s.io/vsphere-csi-driver/v2/pkg/csi/service/vanilla.(*controller).ControllerPublishVolume.func1\n\t/build/pkg/csi/service/vanilla/controller.go:1037\nsigs.k8s.io/vsphere-csi-driver/v2/pkg/csi/service/vanilla.(*controller).ControllerPublishVolume\n\t/build/pkg/csi/service/vanilla/controller.go:1050\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_ControllerPublishVolume_Handler.func1\n\t/go/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/csi.pb.go:5632\ngithub.com/rexray/gocsi/middleware/serialvolume.(*interceptor).controllerPublishVolume\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/serialvolume/serial_volume_locker.go:120\ngithub.com/rexray/gocsi/middleware/serialvolume.(*interceptor).handle\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/serialvolume/serial_volume_locker.go:86\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email 
protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer.func1\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/specvalidator/spec_validator.go:178\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handle\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/specvalidator/spec_validator.go:218\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/specvalidator/spec_validator.go:177\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi.(*StoragePlugin).injectContext\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware.go:231\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:106\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_ControllerPublishVolume_Handler\n\t/go/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/csi.pb.go:5634\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1024\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1313\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.1\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:722"}
{"level":"info","time":"2022-04-10T17:44:49.07233599Z","caller":"volume/util.go:343","msg":"Extract vimfault type: +types.NotFound. SoapFault Info: +&{{http://schemas.xmlsoap.org/soap/envelope/ Fault} ServerFaultCode CNS: Failed to retrieve datastore for vol e5dc0f1b-399b-4839-955e-18a174e4e1ea. (vim.fault.NotFound) {\n faultCause = (vmodl.MethodFault) null, \n faultMessage = <unset>\n msg = \"The vStorageObject (vim.vslm.ID) {\n dynamicType = null,\n dynamicProperty = null,\n id = e5dc0f1b-399b-4839-955e-18a174e4e1ea\n} was not found\"\n} {{{{<nil> []}}}}} from err +ServerFaultCode: CNS: Failed to retrieve datastore for vol e5dc0f1b-399b-4839-955e-18a174e4e1ea. (vim.fault.NotFound) {\n faultCause = (vmodl.MethodFault) null, \n faultMessage = <unset>\n msg = \"The vStorageObject (vim.vslm.ID) {\n dynamicType = null,\n dynamicProperty = null,\n id = e5dc0f1b-399b-4839-955e-18a174e4e1ea\n} was not found\"\n}","TraceId":"6d250629-8008-41b7-baed-2deb2a6baca9"}
{"level":"error","time":"2022-04-10T17:44:49.072375655Z","caller":"common/vsphereutil.go:550","msg":"failed to attach disk \"e5dc0f1b-399b-4839-955e-18a174e4e1ea\" with VM: \"VirtualMachine:vm-1013755 [VirtualCenterHost: atl-qual-vc02.anthos, UUID: 4204f258-985f-b7f0-0782-e9a78fe37425, Datacenter: Datacenter [Datacenter: Datacenter:datacenter-3, VirtualCenterHost: atl-qual-vc02.anthos]]\". err: ServerFaultCode: CNS: Failed to retrieve datastore for vol e5dc0f1b-399b-4839-955e-18a174e4e1ea. (vim.fault.NotFound) {\n faultCause = (vmodl.MethodFault) null, \n faultMessage = <unset>\n msg = \"The vStorageObject (vim.vslm.ID) {\n dynamicType = null,\n dynamicProperty = null,\n id = e5dc0f1b-399b-4839-955e-18a174e4e1ea\n} was not found\"\n} faultType \"vim.fault.NotFound\"","TraceId":"6d250629-8008-41b7-baed-2deb2a6baca9","stacktrace":"sigs.k8s.io/vsphere-csi-driver/v2/pkg/csi/service/common.AttachVolumeUtil\n\t/build/pkg/csi/service/common/vsphereutil.go:550\nsigs.k8s.io/vsphere-csi-driver/v2/pkg/csi/service/vanilla.(*controller).ControllerPublishVolume.func1\n\t/build/pkg/csi/service/vanilla/controller.go:1037\nsigs.k8s.io/vsphere-csi-driver/v2/pkg/csi/service/vanilla.(*controller).ControllerPublishVolume\n\t/build/pkg/csi/service/vanilla/controller.go:1050\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_ControllerPublishVolume_Handler.func1\n\t/go/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/csi.pb.go:5632\ngithub.com/rexray/gocsi/middleware/serialvolume.(*interceptor).controllerPublishVolume\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/serialvolume/serial_volume_locker.go:120\ngithub.com/rexray/gocsi/middleware/serialvolume.(*interceptor).handle\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/serialvolume/serial_volume_locker.go:86\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email 
protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer.func1\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/specvalidator/spec_validator.go:178\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handle\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/specvalidator/spec_validator.go:218\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/specvalidator/spec_validator.go:177\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi.(*StoragePlugin).injectContext\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware.go:231\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:106\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_ControllerPublishVolume_Handler\n\t/go/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/csi.pb.go:5634\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1024\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1313\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.1\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:722"}
{"level":"info","time":"2022-04-10T17:47:20.066064719Z","caller":"vanilla/controller.go:1075","msg":"ControllerUnpublishVolume: called with args {VolumeId:e5dc0f1b-399b-4839-955e-18a174e4e1ea NodeId:08050d6e39f9-qual-322-0afbac17 Secrets:map[] XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}","TraceId":"604cd65a-62ce-49f2-9859-41f6630b7b85"}
{"level":"error","time":"2022-04-10T17:47:20.084717468Z","caller":"vanilla/controller.go:1108","msg":"volumeID \"e5dc0f1b-399b-4839-955e-18a174e4e1ea\" not found in QueryVolume","TraceId":"604cd65a-62ce-49f2-9859-41f6630b7b85","stacktrace":"sigs.k8s.io/vsphere-csi-driver/v2/pkg/csi/service/vanilla.(*controller).ControllerUnpublishVolume.func1\n\t/build/pkg/csi/service/vanilla/controller.go:1108\nsigs.k8s.io/vsphere-csi-driver/v2/pkg/csi/service/vanilla.(*controller).ControllerUnpublishVolume\n\t/build/pkg/csi/service/vanilla/controller.go:1162\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_ControllerUnpublishVolume_Handler.func1\n\t/go/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/csi.pb.go:5650\ngithub.com/rexray/gocsi/middleware/serialvolume.(*interceptor).controllerUnpublishVolume\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/serialvolume/serial_volume_locker.go:141\ngithub.com/rexray/gocsi/middleware/serialvolume.(*interceptor).handle\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/serialvolume/serial_volume_locker.go:88\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer.func1\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/specvalidator/spec_validator.go:178\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handle\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/specvalidator/spec_validator.go:218\ngithub.com/rexray/gocsi/middleware/specvalidator.(*interceptor).handleServer\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware/specvalidator/spec_validator.go:177\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email 
protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi.(*StoragePlugin).injectContext\n\t/go/pkg/mod/github.com/rexray/[email protected]/middleware.go:231\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2.1.1\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:99\ngithub.com/rexray/gocsi/utils.ChainUnaryServer.func2\n\t/go/pkg/mod/github.com/rexray/[email protected]/utils/utils_middleware.go:106\ngithub.com/container-storage-interface/spec/lib/go/csi._Controller_ControllerUnpublishVolume_Handler\n\t/go/pkg/mod/github.com/container-storage-interface/[email protected]/lib/go/csi/csi.pb.go:5652\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1024\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:1313\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.1\n\t/go/pkg/mod/google.golang.org/[email protected]/server.go:722"}
Should this PR fix the detach issue? https://github.com/kubernetes-sigs/vsphere-csi-driver/pull/840 In this case the volume is not attached to the VM, so detach should succeed.
@jingxu97 I see that the error says "CNS: Failed to retrieve datastore for vol ...". Is there anything happening to the datastore? Are you doing something such as datastore evacuation?
In #840, if the VM no longer exists in the vCenter inventory, detach will succeed. If the driver fails to check whether the volume is still attached, it still returns a failure.
I don't think there is datastore evacuation happening.
The specific flaky test is "an existing volume should be accessible on a new node after cluster scale up" from the Anthos qualification tests. I think it only happens in this situation:
- pod with a volume created
- delete pod
- create a new node to join the cluster
- create the pod on the new node
There are other tests that detach a volume and attach it to a different node, but only this test fails. The only difference I can see is that it attaches to a new node. In this case, the attach failed with "Failed to retrieve datastore for vol". After a timeout, the driver tried to detach and failed with the same error.
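For reference, the scenario above can be reproduced with a minimal PVC and pod along these lines (all names and the storage class are illustrative, not taken from the test itself): create the pair, delete the pod, add a fresh node to the cluster, then recreate the pod pinned to the new node so the attach happens on a node that has never seen the volume.

```yaml
# Hypothetical names; the StorageClass is assumed to be backed by csi.vsphere.vmware.com.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: repro-pvc
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: vsphere-csi-sc   # assumed to exist in the cluster
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: repro-pod
spec:
  # After deleting this pod and joining a new node, recreate it with a
  # nodeSelector/nodeName targeting the new node to exercise the attach path.
  containers:
    - name: app
      image: busybox
      command: ["sleep", "infinity"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: repro-pvc
```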
@jingxu97 Can you provide more logs, i.e., complete CSI driver and syncer logs, CSI-attacher sidecar logs, and VC support bundle?
@jingxu97 Have you deployed 2 K8s clusters with the same cluster id? We have seen similar things happening in the past when there are 2 K8s clusters with the same cluster id.
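For anyone checking for this condition: the cluster id lives in the driver's `csi-vsphere.conf` secret (usually in the `vmware-system-csi` or `kube-system` namespace). A sketch of the relevant fragment, with placeholder values:

```ini
; csi-vsphere.conf (values are placeholders)
[Global]
cluster-id = "my-unique-cluster-id"   ; must be unique per K8s cluster on the same vCenter

[VirtualCenter "vc.example.com"]
user = "[email protected]"
password = "***"
datacenters = "dc-1"
```

If two clusters share the same `cluster-id`, CNS bookkeeping for one cluster can clobber the other's volumes.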
vsphere-csi-controller log: https://gist.github.com/jingxu97/842b05c0452d8696d98a96e56793a048 csi-attacher log: https://gist.github.com/jingxu97/0289b2277251b31a01f75b057fd47615
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue or PR as fresh with `/remove-lifecycle stale`
- Mark this issue or PR as rotten with `/lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
We are facing the same issue as @jingxu97.
```shell
# Disk gets listed
❯ govc disk.ls -ds=XXXX-Datastore | grep 38de0cc1b7d7
3eea43aa-e554-4fc5-89ee-38de0cc1b7d7  af2db7e5-8751-11ec-aeb2-d6a8c0fcbe1c

# But not the volume
❯ govc volume.ls | grep 38de0cc1b7d7 | wc -l
0
```
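The two listings above can be cross-checked mechanically across the whole datastore. A small helper along these lines (the function and file names are mine, not part of govc) prints every FCD ID that has no corresponding CNS volume object:

```shell
#!/usr/bin/env bash
# orphaned_fcds FCD_FILE CNS_FILE
#   FCD_FILE: one FCD UUID per line (first column of `govc disk.ls -ds=<ds>`)
#   CNS_FILE: one volume UUID per line (first column of `govc volume.ls`)
# Prints FCD IDs that are missing from the CNS volume list.
orphaned_fcds() {
  # comm -23: lines unique to the first (sorted) input only
  comm -23 <(sort -u "$1") <(sort -u "$2")
}
```

Usage would look something like `govc disk.ls -ds=XXXX-Datastore | awk '{print $1}' > fcd.txt`, `govc volume.ls | awk '{print $1}' > cns.txt`, then `orphaned_fcds fcd.txt cns.txt`.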
@jingxu97 - how do you recover out of such situation?
It turned out our system backend (a maintenance activity) did issue volume deletions in certain cases. So it is not an issue on the vCenter side.
/remove-lifecycle stale
@xing-yang - how do we formally engage you / this team to help resolve this issue? We have already raised a VMware ticket (support request no. 22349608807). How do we get VMware to involve this team in resolving our issues?
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue or PR as fresh with `/remove-lifecycle stale`
- Mark this issue or PR as rotten with `/lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
/assign @xing-yang
@dharapvj Sorry, I missed your ping. Have you got any update on the SR 22349608807? Usually the support team will investigate first and will open a bug for the CSI/CNS team to investigate further if needed.
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue or PR as fresh with `/remove-lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Reopen this issue with `/reopen`
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@dharapvj Were you able to recover from your situation? How could you re-register those disks as volumes?
VMware identified an issue in vCenter's internal database where one table had a duplicate entry for some unknown reason. After deleting that row, everything started to work properly without any changes on the Kubernetes side.