ceph-csi-cephfs 3.7.0 no longer attaches the volume to the pod
Describe the bug
After updating the ceph-csi-cephfs Helm chart to 3.7.0, CephFS PVCs are no longer mounted into pods and the affected pods never start.
We have static CephFS filesystems already created in Ceph; we manually created PVs and PVCs for these filesystems and use those PVCs to mount CephFS in pods.
For example:
volumeMounts:
  - name: ceph-pvc-smaug
    mountPath: /mnt
volumes:
  - name: ceph-pvc-smaug
    persistentVolumeClaim:
      claimName: cephfs-read-smaug
After creating a pod with those mounts, it gets stuck in the ContainerCreating state with an event like:
Warning FailedMount 101s kubelet Unable to attach or mount volumes: unmounted volumes=[ceph-pvc-smaug], unattached volumes=[cephfs-pvc-storage1 kube-api-access-2q26t]: timed out waiting for the condition
Everything was working on 3.6.2.
Environment details
- Image/version of Ceph CSI driver: 3.7.0
- Helm chart version: 3.7.0
- Kernel version: 5.15.0
- Mounter used for mounting PVC (for CephFS it's fuse or kernel; for RBD it's krbd or rbd-nbd): kernel (I guess; see the log check after this list)
- Kubernetes cluster version: 1.24
- Ceph cluster version: 16.2.9
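To confirm which mounter is actually in use, the nodeplugin prints a "chosen mounter" line (visible in the plugin logs below); a quick check like the one here should show it. The namespace and DaemonSet name are assumptions based on a default Helm install, so adjust them to your release:
# Assumed names: namespace "ceph-csi-cephfs", DaemonSet "ceph-csi-cephfs-nodeplugin".
# The "chosen mounter" log line shows whether the kernel client or ceph-fuse is used.
kubectl -n ceph-csi-cephfs logs ds/ceph-csi-cephfs-nodeplugin -c csi-cephfsplugin | grep "chosen mounter"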
Steps to reproduce
Steps to reproduce the behavior:
- Update the ceph-csi-cephfs Helm chart to 3.7.0
- Create a pod with a volume backed by a CephFS PVC
Actual results
PVC not mounted.
Expected behavior
Pod gets created with the mounted PVC.
Logs
If the issue is in PVC mounting, please attach complete logs of the containers below.
- csi-cephfsplugin logs:
I0914 07:50:20.199977 2872353 cephcsi.go:192] Driver version: v3.7.0 and Git version: 34fd27bbd1b9e0efff48805d025a1069b5dbbc53
I0914 07:50:20.200163 2872353 cephcsi.go:210] Initial PID limit is set to 154277
I0914 07:50:20.200189 2872353 cephcsi.go:216] Reconfigured PID limit to -1 (max)
I0914 07:50:20.200207 2872353 cephcsi.go:241] Starting driver type: cephfs with name: cephfs.csi.ceph.com
I0914 07:50:20.210635 2872353 volumemounter.go:79] loaded mounter: kernel
I0914 07:50:20.223808 2872353 volumemounter.go:90] loaded mounter: fuse
I0914 07:50:20.225606 2872353 mount_linux.go:283] Detected umount with safe 'not mounted' behavior
I0914 07:50:20.225841 2872353 server.go:126] Listening for connections on address: &net.UnixAddr{Name:"//csi/csi.sock", Net:"unix"}
I0914 07:50:20.991741 2872353 utils.go:195] ID: 1 GRPC call: /csi.v1.Identity/GetPluginInfo
I0914 07:50:20.993750 2872353 utils.go:199] ID: 1 GRPC request: {}
I0914 07:50:20.993768 2872353 identityserver-default.go:39] ID: 1 Using default GetPluginInfo
I0914 07:50:20.993840 2872353 utils.go:206] ID: 1 GRPC response: {"name":"cephfs.csi.ceph.com","vendor_version":"v3.7.0"}
I0914 07:50:21.042063 2872353 utils.go:195] ID: 2 GRPC call: /csi.v1.Node/NodeGetInfo
I0914 07:50:21.042118 2872353 utils.go:199] ID: 2 GRPC request: {}
I0914 07:50:21.042128 2872353 nodeserver-default.go:51] ID: 2 Using default NodeGetInfo
I0914 07:50:21.042250 2872353 utils.go:206] ID: 2 GRPC response: {"accessible_topology":{},"node_id":"scrap-k8s-node074"}
I0914 07:51:20.396249 2872353 utils.go:195] ID: 3 GRPC call: /csi.v1.Identity/Probe
I0914 07:51:20.396313 2872353 utils.go:199] ID: 3 GRPC request: {}
I0914 07:51:20.396430 2872353 utils.go:206] ID: 3 GRPC response: {}
I0914 07:52:20.368826 2872353 utils.go:195] ID: 4 GRPC call: /csi.v1.Identity/Probe
I0914 07:52:20.368952 2872353 utils.go:199] ID: 4 GRPC request: {}
I0914 07:52:20.368995 2872353 utils.go:206] ID: 4 GRPC response: {}
I0914 07:53:20.421603 2872353 utils.go:195] ID: 5 GRPC call: /csi.v1.Identity/Probe
I0914 07:53:20.421651 2872353 utils.go:199] ID: 5 GRPC request: {}
I0914 07:53:20.421675 2872353 utils.go:206] ID: 5 GRPC response: {}
I0914 07:54:20.378598 2872353 utils.go:195] ID: 6 GRPC call: /csi.v1.Identity/Probe
I0914 07:54:20.378704 2872353 utils.go:199] ID: 6 GRPC request: {}
I0914 07:54:20.378737 2872353 utils.go:206] ID: 6 GRPC response: {}
I0914 07:54:20.410779 2872353 utils.go:195] ID: 7 GRPC call: /csi.v1.Node/NodeGetCapabilities
I0914 07:54:20.410840 2872353 utils.go:199] ID: 7 GRPC request: {}
I0914 07:54:20.411054 2872353 utils.go:206] ID: 7 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}}]}
I0914 07:54:20.416108 2872353 utils.go:195] ID: 8 GRPC call: /csi.v1.Node/NodeGetCapabilities
I0914 07:54:20.416165 2872353 utils.go:199] ID: 8 GRPC request: {}
I0914 07:54:20.416284 2872353 utils.go:206] ID: 8 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}}]}
I0914 07:54:20.417080 2872353 utils.go:195] ID: 9 GRPC call: /csi.v1.Node/NodeGetCapabilities
I0914 07:54:20.417176 2872353 utils.go:199] ID: 9 GRPC request: {}
I0914 07:54:20.417294 2872353 utils.go:206] ID: 9 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}}]}
I0914 07:54:20.418622 2872353 utils.go:195] ID: 10 Req-ID: data-platform-cephfs-read-smaug GRPC call: /csi.v1.Node/NodeStageVolume
I0914 07:54:20.418891 2872353 utils.go:199] ID: 10 Req-ID: data-platform-cephfs-read-smaug GRPC request: {"secrets":"***stripped***","staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/cephfs.csi.ceph.com/cdc6743d9f28807918659cd1a98dcd8722c76e1e504072c73d109c5ecec66fc7/globalmount","volume_capability":{"AccessType":{"Mount":{}},"access_mode":{"mode":3}},"volume_context":{"clusterID":"96df99f6-fc1a-11ea-90a4-6cb3113cb732","fsName":"storage","kernelMountOptions":"recover_session=clean,ro","rootPath":"/smaug","staticVolume":"true"},"volume_id":"data-platform-cephfs-read-smaug"}
I0914 07:54:20.419175 2872353 volumemounter.go:126] requested mounter: , chosen mounter: kernel
I0914 07:54:20.419238 2872353 nodeserver.go:187] ID: 10 Req-ID: data-platform-cephfs-read-smaug cephfs: volume data-platform-cephfs-read-smaug is already mounted to /var/lib/kubelet/plugins/kubernetes.io/csi/cephfs.csi.ceph.com/cdc6743d9f28807918659cd1a98dcd8722c76e1e504072c73d109c5ecec66fc7/globalmount, skipping
I0914 07:54:20.419271 2872353 utils.go:206] ID: 10 Req-ID: data-platform-cephfs-read-smaug GRPC response: {}
I0914 07:54:20.420168 2872353 utils.go:195] ID: 11 GRPC call: /csi.v1.Node/NodeGetCapabilities
I0914 07:54:20.420220 2872353 utils.go:199] ID: 11 GRPC request: {}
I0914 07:54:20.420343 2872353 utils.go:206] ID: 11 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}}]}
I0914 07:54:20.421056 2872353 utils.go:195] ID: 12 GRPC call: /csi.v1.Node/NodeGetCapabilities
I0914 07:54:20.421105 2872353 utils.go:199] ID: 12 GRPC request: {}
I0914 07:54:20.421252 2872353 utils.go:206] ID: 12 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}}]}
I0914 07:54:20.422052 2872353 utils.go:195] ID: 13 GRPC call: /csi.v1.Node/NodeGetCapabilities
I0914 07:54:20.422100 2872353 utils.go:199] ID: 13 GRPC request: {}
I0914 07:54:20.422216 2872353 utils.go:206] ID: 13 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}}]}
I0914 07:54:20.423125 2872353 utils.go:195] ID: 14 Req-ID: data-platform-cephfs-read-smaug GRPC call: /csi.v1.Node/NodePublishVolume
I0914 07:54:20.423298 2872353 utils.go:199] ID: 14 Req-ID: data-platform-cephfs-read-smaug GRPC request: {"staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/cephfs.csi.ceph.com/cdc6743d9f28807918659cd1a98dcd8722c76e1e504072c73d109c5ecec66fc7/globalmount","target_path":"/var/lib/kubelet/pods/df67067e-0b04-4630-aba2-8ab3995c775d/volumes/kubernetes.io~csi/data-platform-cephfs-read-smaug/mount","volume_capability":{"AccessType":{"Mount":{}},"access_mode":{"mode":3}},"volume_context":{"clusterID":"96df99f6-fc1a-11ea-90a4-6cb3113cb732","fsName":"storage","kernelMountOptions":"recover_session=clean,ro","rootPath":"/smaug","staticVolume":"true"},"volume_id":"data-platform-cephfs-read-smaug"}
I0914 07:54:20.426100 2872353 cephcmds.go:105] ID: 14 Req-ID: data-platform-cephfs-read-smaug command succeeded: mount [-o bind,_netdev /var/lib/kubelet/plugins/kubernetes.io/csi/cephfs.csi.ceph.com/cdc6743d9f28807918659cd1a98dcd8722c76e1e504072c73d109c5ecec66fc7/globalmount /var/lib/kubelet/pods/df67067e-0b04-4630-aba2-8ab3995c775d/volumes/kubernetes.io~csi/data-platform-cephfs-read-smaug/mount]
I0914 07:54:20.426127 2872353 nodeserver.go:467] ID: 14 Req-ID: data-platform-cephfs-read-smaug cephfs: successfully bind-mounted volume data-platform-cephfs-read-smaug to /var/lib/kubelet/pods/df67067e-0b04-4630-aba2-8ab3995c775d/volumes/kubernetes.io~csi/data-platform-cephfs-read-smaug/mount
I0914 07:54:20.426175 2872353 utils.go:206] ID: 14 Req-ID: data-platform-cephfs-read-smaug GRPC response: {}
I0914 07:55:13.663939 2872353 utils.go:195] ID: 15 GRPC call: /csi.v1.Node/NodeGetCapabilities
I0914 07:55:13.664007 2872353 utils.go:199] ID: 15 GRPC request: {}
I0914 07:55:13.664119 2872353 utils.go:206] ID: 15 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}}]}
I0914 07:55:13.814538 2872353 utils.go:195] ID: 16 GRPC call: /csi.v1.Node/NodeGetCapabilities
I0914 07:55:13.814586 2872353 utils.go:199] ID: 16 GRPC request: {}
I0914 07:55:13.814681 2872353 utils.go:206] ID: 16 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}}]}
I0914 07:55:13.815583 2872353 utils.go:195] ID: 17 GRPC call: /csi.v1.Node/NodeGetCapabilities
I0914 07:55:13.815650 2872353 utils.go:199] ID: 17 GRPC request: {}
I0914 07:55:13.815768 2872353 utils.go:206] ID: 17 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}}]}
I0914 07:55:13.816558 2872353 utils.go:195] ID: 18 Req-ID: parsing-cephfs-write-workdir GRPC call: /csi.v1.Node/NodeStageVolume
I0914 07:55:13.816687 2872353 utils.go:199] ID: 18 Req-ID: parsing-cephfs-write-workdir GRPC request: {"secrets":"***stripped***","staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/cephfs.csi.ceph.com/001e385c45ece6512ca7f68d86b0aac955f36c504fd1ad501397e610d0046a31/globalmount","volume_capability":{"AccessType":{"Mount":{}},"access_mode":{"mode":5}},"volume_context":{"clusterID":"96df99f6-fc1a-11ea-90a4-6cb3113cb732","fsName":"storage","kernelMountOptions":"recover_session=clean","rootPath":"/workdir","staticVolume":"true"},"volume_id":"parsing-cephfs-write-workdir"}
I0914 07:55:13.816869 2872353 volumemounter.go:126] requested mounter: , chosen mounter: kernel
I0914 07:55:13.816928 2872353 nodeserver.go:187] ID: 18 Req-ID: parsing-cephfs-write-workdir cephfs: volume parsing-cephfs-write-workdir is already mounted to /var/lib/kubelet/plugins/kubernetes.io/csi/cephfs.csi.ceph.com/001e385c45ece6512ca7f68d86b0aac955f36c504fd1ad501397e610d0046a31/globalmount, skipping
I0914 07:55:13.816954 2872353 utils.go:206] ID: 18 Req-ID: parsing-cephfs-write-workdir GRPC response: {}
I0914 07:55:13.817718 2872353 utils.go:195] ID: 19 GRPC call: /csi.v1.Node/NodeGetCapabilities
I0914 07:55:13.817789 2872353 utils.go:199] ID: 19 GRPC request: {}
I0914 07:55:13.817915 2872353 utils.go:206] ID: 19 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}}]}
I0914 07:55:13.818790 2872353 utils.go:195] ID: 20 GRPC call: /csi.v1.Node/NodeGetCapabilities
I0914 07:55:13.818832 2872353 utils.go:199] ID: 20 GRPC request: {}
I0914 07:55:13.818927 2872353 utils.go:206] ID: 20 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}}]}
I0914 07:55:13.819680 2872353 utils.go:195] ID: 21 GRPC call: /csi.v1.Node/NodeGetCapabilities
I0914 07:55:13.819720 2872353 utils.go:199] ID: 21 GRPC request: {}
I0914 07:55:13.819808 2872353 utils.go:206] ID: 21 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}}]}
I0914 07:55:13.820509 2872353 utils.go:195] ID: 22 Req-ID: parsing-cephfs-write-workdir GRPC call: /csi.v1.Node/NodePublishVolume
I0914 07:55:13.820633 2872353 utils.go:199] ID: 22 Req-ID: parsing-cephfs-write-workdir GRPC request: {"staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/cephfs.csi.ceph.com/001e385c45ece6512ca7f68d86b0aac955f36c504fd1ad501397e610d0046a31/globalmount","target_path":"/var/lib/kubelet/pods/0b09d9e7-fb01-423e-92be-8660c4367c42/volumes/kubernetes.io~csi/parsing-cephfs-write-workdir/mount","volume_capability":{"AccessType":{"Mount":{}},"access_mode":{"mode":5}},"volume_context":{"clusterID":"96df99f6-fc1a-11ea-90a4-6cb3113cb732","fsName":"storage","kernelMountOptions":"recover_session=clean","rootPath":"/workdir","staticVolume":"true"},"volume_id":"parsing-cephfs-write-workdir"}
I0914 07:55:13.823710 2872353 cephcmds.go:105] ID: 22 Req-ID: parsing-cephfs-write-workdir command succeeded: mount [-o bind,_netdev /var/lib/kubelet/plugins/kubernetes.io/csi/cephfs.csi.ceph.com/001e385c45ece6512ca7f68d86b0aac955f36c504fd1ad501397e610d0046a31/globalmount /var/lib/kubelet/pods/0b09d9e7-fb01-423e-92be-8660c4367c42/volumes/kubernetes.io~csi/parsing-cephfs-write-workdir/mount]
I0914 07:55:13.823754 2872353 nodeserver.go:467] ID: 22 Req-ID: parsing-cephfs-write-workdir cephfs: successfully bind-mounted volume parsing-cephfs-write-workdir to /var/lib/kubelet/pods/0b09d9e7-fb01-423e-92be-8660c4367c42/volumes/kubernetes.io~csi/parsing-cephfs-write-workdir/mount
I0914 07:55:13.823782 2872353 utils.go:206] ID: 22 Req-ID: parsing-cephfs-write-workdir GRPC response: {}
I0914 07:55:20.422243 2872353 utils.go:195] ID: 23 GRPC call: /csi.v1.Identity/Probe
I0914 07:55:20.422324 2872353 utils.go:199] ID: 23 GRPC request: {}
I0914 07:55:20.422347 2872353 utils.go:206] ID: 23 GRPC response: {}
I0914 07:56:07.685383 2872353 utils.go:195] ID: 24 GRPC call: /csi.v1.Node/NodeGetCapabilities
I0914 07:56:07.685453 2872353 utils.go:199] ID: 24 GRPC request: {}
I0914 07:56:07.685587 2872353 utils.go:206] ID: 24 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}}]}
I0914 07:56:07.686780 2872353 utils.go:195] ID: 25 GRPC call: /csi.v1.Node/NodeGetVolumeStats
I0914 07:56:07.686913 2872353 utils.go:199] ID: 25 GRPC request: {"volume_id":"parsing-cephfs-write-workdir","volume_path":"/var/lib/kubelet/pods/0b09d9e7-fb01-423e-92be-8660c4367c42/volumes/kubernetes.io~csi/parsing-cephfs-write-workdir/mount"}
I0914 07:56:07.688022 2872353 utils.go:206] ID: 25 GRPC response: {"usage":[{"available":147697734516736,"total":513936440950784,"unit":1,"used":366238706434048},{"total":337453544,"unit":2,"used":337453545}]}
I0914 07:56:20.388033 2872353 utils.go:195] ID: 26 GRPC call: /csi.v1.Identity/Probe
I0914 07:56:20.388352 2872353 utils.go:199] ID: 26 GRPC request: {}
I0914 07:56:20.388569 2872353 utils.go:206] ID: 26 GRPC response: {}
I0914 07:57:20.422717 2872353 utils.go:195] ID: 27 GRPC call: /csi.v1.Identity/Probe
I0914 07:57:20.422767 2872353 utils.go:199] ID: 27 GRPC request: {}
I0914 07:57:20.422789 2872353 utils.go:206] ID: 27 GRPC response: {}
I0914 07:58:05.092902 2872353 utils.go:195] ID: 28 GRPC call: /csi.v1.Node/NodeGetCapabilities
I0914 07:58:05.092981 2872353 utils.go:199] ID: 28 GRPC request: {}
I0914 07:58:05.093108 2872353 utils.go:206] ID: 28 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}}]}
I0914 07:58:05.094114 2872353 utils.go:195] ID: 29 GRPC call: /csi.v1.Node/NodeGetVolumeStats
I0914 07:58:05.094174 2872353 utils.go:199] ID: 29 GRPC request: {"volume_id":"parsing-cephfs-write-workdir","volume_path":"/var/lib/kubelet/pods/0b09d9e7-fb01-423e-92be-8660c4367c42/volumes/kubernetes.io~csi/parsing-cephfs-write-workdir/mount"}
I0914 07:58:05.094918 2872353 utils.go:206] ID: 29 GRPC response: {"usage":[{"available":147693942865920,"total":513932653494272,"unit":1,"used":366238710628352},{"total":337453550,"unit":2,"used":337453551}]}
I0914 07:58:20.397671 2872353 utils.go:195] ID: 30 GRPC call: /csi.v1.Identity/Probe
I0914 07:58:20.397792 2872353 utils.go:199] ID: 30 GRPC request: {}
I0914 07:58:20.397918 2872353 utils.go:206] ID: 30 GRPC response: {}
- driver-registrar logs:
I0914 07:50:19.981828 2872319 main.go:166] Version: v2.5.1
I0914 07:50:19.981874 2872319 main.go:167] Running node-driver-registrar in mode=registration
I0914 07:50:19.984157 2872319 main.go:191] Attempting to open a gRPC connection with: "/csi/csi.sock"
I0914 07:50:19.984181 2872319 connection.go:154] Connecting to unix:///csi/csi.sock
I0914 07:50:20.985220 2872319 main.go:198] Calling CSI driver to discover driver name
I0914 07:50:20.985254 2872319 connection.go:183] GRPC call: /csi.v1.Identity/GetPluginInfo
I0914 07:50:20.985261 2872319 connection.go:184] GRPC request: {}
I0914 07:50:20.994617 2872319 connection.go:186] GRPC response: {"name":"cephfs.csi.ceph.com","vendor_version":"v3.7.0"}
I0914 07:50:20.994731 2872319 connection.go:187] GRPC error: <nil>
I0914 07:50:20.994742 2872319 main.go:208] CSI driver name: "cephfs.csi.ceph.com"
I0914 07:50:20.994781 2872319 node_register.go:53] Starting Registration Server at: /registration/cephfs.csi.ceph.com-reg.sock
I0914 07:50:20.995049 2872319 node_register.go:62] Registration Server started at: /registration/cephfs.csi.ceph.com-reg.sock
I0914 07:50:20.995205 2872319 node_register.go:92] Skipping HTTP server because endpoint is set to: ""
I0914 07:50:21.040494 2872319 main.go:102] Received GetInfo call: &InfoRequest{}
I0914 07:50:21.040948 2872319 main.go:109] "Kubelet registration probe created" path="/var/lib/kubelet/plugins/cephfs.csi.ceph.com/registration"
I0914 07:50:21.067754 2872319 main.go:120] Received NotifyRegistrationStatus call: &RegistrationStatus{PluginRegistered:true,Error:,}
Additional context
Could the change in #3149 possibly be related?
@thrashwerk, the logs don't contain information about the PVC cephfs-read-smaug you mentioned. Did you check whether that PVC got to the Bound state? I see a few other CephFS PVCs working fine, though.
It's been bound for over 100 days; it's not a new PVC. We create them manually once and reuse them.
Maybe related to this issue: https://github.com/ceph/ceph-csi/issues/3309
I can also confirm this issue: static CephFS PVs that I created by hand, and that had been working for the past few months, broke following the 3.7.0 update.
@geraldwuhoo Can you please provide the PVC/PV and app YAML files?
Here's an example of my configuration.
PV:
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/bound-by-controller: "yes"
  creationTimestamp: "2022-05-31T13:40:40Z"
  name: data-platform-cephfs-read-smaug
spec:
  accessModes:
  - ReadOnlyMany
  capacity:
    storage: 1Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: cephfs-read-smaug
    namespace: data-platform
  csi:
    driver: cephfs.csi.ceph.com
    nodeStageSecretRef:
      name: cephfs-client-storage-read-smaug-creds
      namespace: ceph-csi-cephfs
    volumeAttributes:
      clusterID: 96df99f6
      fsName: storage
      kernelMountOptions: recover_session=clean,ro
      rootPath: /smaug
      staticVolume: "true"
    volumeHandle: data-platform-cephfs-read-smaug
  persistentVolumeReclaimPolicy: Retain
  volumeMode: Filesystem
status:
  phase: Bound
PVC:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
  creationTimestamp: "2022-05-31T13:40:40Z"
  name: cephfs-read-smaug
  namespace: data-platform
spec:
  accessModes:
  - ReadOnlyMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: ""
  volumeMode: Filesystem
  volumeName: data-platform-cephfs-read-smaug
status:
  accessModes:
  - ReadOnlyMany
  capacity:
    storage: 1Gi
  phase: Bound
@thrashwerk Thanks for the PVC/PV spec; can you please also provide the pod.yaml?
This is the one I used to test it specifically:
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: "yes"
  name: pvc-test
  namespace: data-platform
spec:
  containers:
  - command:
    - tail
    - -f
    - /dev/null
    image: opensuse/leap
    name: opensuse
    resources:
      limits:
        cpu: "1"
        memory: 1Gi
      requests:
        cpu: "1"
        memory: 1Gi
    volumeMounts:
    - name: ceph-pvc-smaug
      mountPath: /mnt
  volumes:
  - name: ceph-pvc-smaug
    persistentVolumeClaim:
      claimName: cephfs-read-smaug
I've tried updating to 3.7.1 without much luck, but I dug a bit more and found some interesting things.
- Updated to 3.7.1
- Created the same pod using the same old PVC
- Checked mounts on the host where the pod was scheduled
REDACTED:/smaug on /var/lib/kubelet/plugins/kubernetes.io/csi/cephfs.csi.ceph.com/cdc6743d9f28807918659cd1a98dcd8722c76e1e504072c73d109c5ecec66fc7/globalmount type ceph (ro,relatime,name=storage.read-smaug,secret=<hidden>,fsid=00000000-0000-0000-0000-000000000000,acl,mds_namespace=storage,recover_session=clean,_netdev)
REDACTED:/smaug on /var/lib/kubelet/pods/65c2f352-0521-463e-aebd-bb1e7f14ce9c/volumes/kubernetes.io~csi/data-platform-cephfs-read-smaug/mount type ceph (ro,relatime,name=storage.read-smaug,secret=<hidden>,fsid=00000000-0000-0000-0000-000000000000,acl,mds_namespace=storage,recover_session=clean,_netdev)
- The mounts work correctly from the host
So it seems ceph-csi creates the mounts correctly but doesn't communicate back to Kubernetes that the volume is ready and the pod can now be created?
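To check whether the kubelet side actually sees those mounts, a few things may be worth looking at. This is only a sketch; the node and volume names are taken from the earlier logs, so use whichever node the pod was actually scheduled on:
# From the API side: confirm the cephfs node plugin is registered on that node.
kubectl get csinode scrap-k8s-node074 -o yaml
# On the node itself: both the staging mount and the per-pod bind mount should be listed.
findmnt -t ceph
findmnt | grep "kubernetes.io~csi/data-platform-cephfs-read-smaug"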
Some logs from the nodeplugin pod's csi-cephfsplugin container:
I1004 11:09:12.766490 392133 utils.go:195] ID: 7 GRPC call: /csi.v1.Node/NodeGetCapabilities
I1004 11:09:12.766553 392133 utils.go:206] ID: 7 GRPC request: {}
I1004 11:09:12.766685 392133 utils.go:212] ID: 7 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}}]}
I1004 11:09:12.771501 392133 utils.go:195] ID: 8 GRPC call: /csi.v1.Node/NodeGetCapabilities
I1004 11:09:12.771566 392133 utils.go:206] ID: 8 GRPC request: {}
I1004 11:09:12.771703 392133 utils.go:212] ID: 8 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}}]}
I1004 11:09:12.772485 392133 utils.go:195] ID: 9 GRPC call: /csi.v1.Node/NodeGetCapabilities
I1004 11:09:12.772532 392133 utils.go:206] ID: 9 GRPC request: {}
I1004 11:09:12.772624 392133 utils.go:212] ID: 9 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}}]}
I1004 11:09:12.773559 392133 utils.go:195] ID: 10 Req-ID: data-platform-cephfs-read-smaug GRPC call: /csi.v1.Node/NodeStageVolume
I1004 11:09:12.773813 392133 utils.go:206] ID: 10 Req-ID: data-platform-cephfs-read-smaug GRPC request: {"secrets":"***stripped***","staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/cephfs.csi.ceph.com/cdc6743d9f28807918659cd1a98dcd8722c76e1e504072c73d109c5ecec66fc7/globalmount","volume_capability":{"AccessType":{"Mount":{}},"access_mode":{"mode":3}},"volume_context":{"clusterID":"96df99f6-fc1a-11ea-90a4-6cb3113cb732","fsName":"storage","kernelMountOptions":"recover_session=clean,ro","rootPath":"/smaug","staticVolume":"true"},"volume_id":"data-platform-cephfs-read-smaug"}
I1004 11:09:12.774018 392133 volumemounter.go:126] requested mounter: , chosen mounter: kernel
I1004 11:09:12.774091 392133 nodeserver.go:247] ID: 10 Req-ID: data-platform-cephfs-read-smaug cephfs: mounting volume data-platform-cephfs-read-smaug with Ceph kernel client
I1004 11:09:12.776395 392133 cephcmds.go:105] ID: 10 Req-ID: data-platform-cephfs-read-smaug command succeeded: modprobe [ceph]
I1004 11:09:12.961296 392133 cephcmds.go:105] ID: 10 Req-ID: data-platform-cephfs-read-smaug command succeeded: mount [-t ceph REDACTED:/smaug /var/lib/kubelet/plugins/kubernetes.io/csi/cephfs.csi.ceph.com/cdc6743d9f28807918659cd1a98dcd8722c76e1e504072c73d109c5ecec66fc7/globalmount -o name=storage.read-smaug,secretfile=/tmp/csi/keys/keyfile-1721295337,mds_namespace=storage,recover_session=clean,ro,_netdev]
I1004 11:09:12.961353 392133 nodeserver.go:206] ID: 10 Req-ID: data-platform-cephfs-read-smaug cephfs: successfully mounted volume data-platform-cephfs-read-smaug to /var/lib/kubelet/plugins/kubernetes.io/csi/cephfs.csi.ceph.com/cdc6743d9f28807918659cd1a98dcd8722c76e1e504072c73d109c5ecec66fc7/globalmount
I1004 11:09:12.961406 392133 utils.go:212] ID: 10 Req-ID: data-platform-cephfs-read-smaug GRPC response: {}
I1004 11:09:12.962263 392133 utils.go:195] ID: 11 GRPC call: /csi.v1.Node/NodeGetCapabilities
I1004 11:09:12.962319 392133 utils.go:206] ID: 11 GRPC request: {}
I1004 11:09:12.962419 392133 utils.go:212] ID: 11 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}}]}
I1004 11:09:12.963108 392133 utils.go:195] ID: 12 GRPC call: /csi.v1.Node/NodeGetCapabilities
I1004 11:09:12.963152 392133 utils.go:206] ID: 12 GRPC request: {}
I1004 11:09:12.963249 392133 utils.go:212] ID: 12 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}}]}
I1004 11:09:12.963935 392133 utils.go:195] ID: 13 GRPC call: /csi.v1.Node/NodeGetCapabilities
I1004 11:09:12.963982 392133 utils.go:206] ID: 13 GRPC request: {}
I1004 11:09:12.964082 392133 utils.go:212] ID: 13 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}}]}
I1004 11:09:12.964945 392133 utils.go:195] ID: 14 Req-ID: data-platform-cephfs-read-smaug GRPC call: /csi.v1.Node/NodePublishVolume
I1004 11:09:12.965179 392133 utils.go:206] ID: 14 Req-ID: data-platform-cephfs-read-smaug GRPC request: {"staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/cephfs.csi.ceph.com/cdc6743d9f28807918659cd1a98dcd8722c76e1e504072c73d109c5ecec66fc7/globalmount","target_path":"/var/lib/kubelet/pods/65c2f352-0521-463e-aebd-bb1e7f14ce9c/volumes/kubernetes.io~csi/data-platform-cephfs-read-smaug/mount","volume_capability":{"AccessType":{"Mount":{}},"access_mode":{"mode":3}},"volume_context":{"clusterID":"96df99f6-fc1a-11ea-90a4-6cb3113cb732","fsName":"storage","kernelMountOptions":"recover_session=clean,ro","rootPath":"/smaug","staticVolume":"true"},"volume_id":"data-platform-cephfs-read-smaug"}
I1004 11:09:12.968443 392133 cephcmds.go:105] ID: 14 Req-ID: data-platform-cephfs-read-smaug command succeeded: mount [-o bind,_netdev /var/lib/kubelet/plugins/kubernetes.io/csi/cephfs.csi.ceph.com/cdc6743d9f28807918659cd1a98dcd8722c76e1e504072c73d109c5ecec66fc7/globalmount /var/lib/kubelet/pods/65c2f352-0521-463e-aebd-bb1e7f14ce9c/volumes/kubernetes.io~csi/data-platform-cephfs-read-smaug/mount]
I1004 11:09:12.968469 392133 nodeserver.go:467] ID: 14 Req-ID: data-platform-cephfs-read-smaug cephfs: successfully bind-mounted volume data-platform-cephfs-read-smaug to /var/lib/kubelet/pods/65c2f352-0521-463e-aebd-bb1e7f14ce9c/volumes/kubernetes.io~csi/data-platform-cephfs-read-smaug/mount
I1004 11:09:12.968524 392133 utils.go:212] ID: 14 Req-ID: data-platform-cephfs-read-smaug GRPC response: {}
I have tested with cephcsi v3.7.0 (quay.io/cephcsi/cephcsi:v3.7.0) and everything works fine for me:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
  creationTimestamp: "2022-10-10T09:44:56Z"
  finalizers:
  - kubernetes.io/pvc-protection
  name: cephfs-static-pvc
  namespace: default
  resourceVersion: "5362"
  uid: 79e0ac7a-de40-4e60-b52d-d9e3e79930e5
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  volumeMode: Filesystem
  volumeName: cephfs-static-pv
status:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 1Gi
  phase: Bound
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/bound-by-controller: "yes"
  creationTimestamp: "2022-10-10T09:44:51Z"
  finalizers:
  - kubernetes.io/pv-protection
  name: cephfs-static-pv
  resourceVersion: "5358"
  uid: d1c849bd-7206-435f-9f5c-9247eb52b56d
spec:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 1Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: cephfs-static-pvc
    namespace: default
    resourceVersion: "5356"
    uid: 79e0ac7a-de40-4e60-b52d-d9e3e79930e5
  csi:
    driver: rook-ceph.cephfs.csi.ceph.com
    nodeStageSecretRef:
      name: rook-csi-cephfs-node
      namespace: rook-ceph
    volumeAttributes:
      clusterID: rook-ceph
      fsName: myfs
      kernelMountOptions: recover_session=clean
      rootPath: /test
      staticVolume: "true"
    volumeHandle: cephfs-static-pv
  persistentVolumeReclaimPolicy: Retain
  volumeMode: Filesystem
status:
  phase: Bound
mrajanna@fedora ceph-csi $ cat static-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pvc-test
spec:
  containers:
  - command:
    - tail
    - -f
    - /dev/null
    image: opensuse/leap
    name: opensuse
    resources:
      limits:
        cpu: "1"
        memory: 1Gi
      requests:
        cpu: "1"
        memory: 1Gi
    volumeMounts:
    - name: ceph-static-pvc
      mountPath: /mnt
  volumes:
  - name: ceph-static-pvc
    persistentVolumeClaim:
      claimName: cephfs-static-pvc
sh-4.4# df -h
Filesystem Size Used Avail Use% Mounted on
overlay 28G 5.6G 21G 22% /
tmpfs 64M 0 64M 0% /dev
tmpfs 2.0G 0 2.0G 0% /sys/fs/cgroup
10.102.81.126:6789:/test 31G 0 31G 0% /mnt
/dev/vda1 28G 5.6G 21G 22% /etc/hosts
shm 64M 0 64M 0% /dev/shm
tmpfs 1.0G 12K 1.0G 1% /run/secrets/kubernetes.io/serviceaccount
tmpfs 2.0G 0 2.0G 0% /proc/acpi
tmpfs 2.0G 0 2.0G 0% /proc/scsi
tmpfs 2.0G 0 2.0G 0% /sys/firmware
sh-4.4# touch /mnt/a
sh-4.4# echo testing > /mnt/a
sh-4.4# cat /mnt/a
testing
sh-4.4#
I've tried completely uninstalling the old Helm release, installing 3.7.1, creating a PV and PVC for a different volume in a different namespace, and creating a pod using that PVC. It still didn't work.
The nodeplugin logs show that the global and pod mounts were successful, and I checked that they work from the host, but the pod still gets "timed out waiting for the condition".
I'm not sure where to check anymore.
The kubelet root path is /var/lib/kubelet, right? Checking the kubelet logs might help you. At least what I can see from your nodeplugin logs is that cephcsi attached the volume to the requested targetPath (that is the extent of cephcsi's work).
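A rough sketch of those checks on the affected node (the volume name in the grep is just the example from this thread, adjust it as needed):
# Confirm the kubelet root directory (defaults to /var/lib/kubelet unless --root-dir is overridden).
ps -o args= -C kubelet | tr ' ' '\n' | grep -- '--root-dir'
# Look for the kubelet volume-manager / mount messages for the affected volume.
journalctl -u kubelet | grep -i -e 'MountVolume' -e 'data-platform-cephfs-read-smaug' | tail -n 50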
I'm experiencing the same issue; what debug information can I provide to help resolve it?
cephcsi: v3.7.2, Kubernetes: v1.24, Ceph: v17.2.4
PV is provisioned, PVC is bound.
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
cephfs-pvc Bound pvc-b79ec35a-be89-452d-a85e-0e470f231fff 1Gi RWX deploi-file-storage-sc 64s
Pod does not start
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
cephfs-demo-pod-2 0/1 ContainerCreating 0 2m4s
These events are in the log:
$ kubectl describe pod cephfs-demo-pod-2
....
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m25s default-scheduler Successfully assigned default/cephfs-demo-pod-2 to deploikubeworker1
Warning FailedAttachVolume 25s attachdetach-controller AttachVolume.Attach failed for volume "pvc-b79ec35a-be89-452d-a85e-0e470f231fff" : timed out waiting for external-attacher of cephfs.csi.ceph.com CSI driver to attach volume 0001-0024-1ea33904-4bad-11ed-b842-2d991d088396-0000000000000003-765fa4aa-55ec-11ed-b876-4698c7018e48
Warning FailedMount 22s kubelet Unable to attach or mount volumes: unmounted volumes=[mycephfs2], unattached volumes=[mycephfs2 kube-api-access-kgmcn]: timed out waiting for the condition
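Not sure if it applies here, but for a FailedAttachVolume timeout like this it can help to check whether a VolumeAttachment object was ever created and whether the registered CSIDriver object still expects an external attacher; as far as I know, recent ceph-csi releases register cephfs with attachRequired: false, so a stale CSIDriver object left over from an older install is one possible cause. A sketch, using the PV name from the output above:
# Was a VolumeAttachment created for the PV, and has any attacher acted on it?
kubectl get volumeattachment | grep pvc-b79ec35a-be89-452d-a85e-0e470f231fff
# Does the registered CSIDriver still require attachment?
kubectl get csidriver cephfs.csi.ceph.com -o yaml | grep attachRequired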
@martinfjohansen, is your Ceph cluster healthy? If yes, please open a new issue with the required logs. I am closing this one as we did not get the logs that were requested.
Sure, I will open a new ticket. What are the required logs?
Just fill in the issue template; it contains all the required information.
Completed? What is completed? Ignorance of the bug?
Sorry that I can't keep breaking the cluster every day to get the logs.
@thrashwerk, please check the above comment. Cephcsi's job is to mount the volume to the targetPath requested by the kubelet; you should check the kubelet logs to see what is happening outside of cephcsi.
I have now posted logs over in a new issue #3476