
Backward Compatibliity broken for RWO volumes

Open humblec opened this issue 3 years ago • 6 comments

What happened?

Since the beginning, RWO volumes have been treated as single-node attachments, as documented here: https://kubernetes.io/docs/concepts/storage/persistent-volumes/#access-modes. The attach/detach (a/d) controller in kube-controller-manager prevents the attachment of RWO volumes to multiple nodes. However, because RWO does not really enforce single-pod access to the volume, the RWOP access mode was introduced, which is supposed to preserve the backward compatibility of RWO volumes while adding more granular control over single- vs. multi-pod access.

https://kubernetes.io/blog/2021/09/13/read-write-once-pod-access-mode-alpha/
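For reference, the two modes differ only in the PVC's accessModes field. A minimal sketch (claim names and storage class are placeholders):

# RWO: the a/d controller restricts attachment to a single node, but only for attachable volumes.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-rwo               # placeholder name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: example-sc # placeholder
---
# RWOP: restricts the volume to a single pod cluster-wide
# (alpha behind the ReadWriteOncePod feature gate at the time of this issue).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-rwop              # placeholder name
spec:
  accessModes:
    - ReadWriteOncePod
  resources:
    requests:
      storage: 10Gi
  storageClassName: example-sc # placeholder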

The CSI integration also allows a CSI driver to drop the external-attacher sidecar when the driver's controller service does not implement ControllerPublishVolume/ControllerUnpublishVolume, by declaring attachRequired: false in the CSIDriver object.
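For context, a minimal sketch of such a CSIDriver object (the driver name is a placeholder):

apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: example.csi.vendor.com # placeholder driver name
spec:
  # With attachRequired: false, the attach step and VolumeAttachment objects are skipped,
  # so the external-attacher sidecar (and ControllerPublish/UnpublishVolume) is not needed.
  attachRequired: false
  podInfoOnMount: false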

But in the absence of the attacher sidecar, RWO backward compatibility appears to be broken: RWO volumes can get mounted on more than one node at the same time when the attacher sidecar is not present.
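One way to observe why the enforcement disappears: the a/d controller acts on VolumeAttachment objects, and when attachRequired is false none are created for the volume, so there is nothing for the controller to serialize across nodes. For example:

$ kubectl get volumeattachments
# For an attach-required driver this lists one object per node the volume is attached to;
# with attachRequired: false the list stays empty for these volumes.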

What did you expect to happen?

RWO volumes should not be mounted on more than one node, even in the absence of the attacher sidecar, because that is the behavior declared in the documentation referenced above.

How can we reproduce it (as minimally and precisely as possible)?

Create Pod replicas on multiple nodes in a Kubernetes 1.24 cluster that all mount the same RWO volume.
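A minimal sketch of such a reproducer (all names are placeholders; it assumes the RWO claim sketched above): a Deployment whose replicas are forced onto different nodes by pod anti-affinity while sharing the same RWO PVC. With a driver that sets attachRequired: false, both replicas reach Running instead of the second one being blocked.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: rwo-repro              # placeholder
spec:
  replicas: 2
  selector:
    matchLabels:
      app: rwo-repro
  template:
    metadata:
      labels:
        app: rwo-repro
    spec:
      # Force the replicas onto different nodes so the single-node guarantee is exercised.
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: rwo-repro
              topologyKey: kubernetes.io/hostname
      containers:
        - name: app
          image: busybox
          command: ["sh", "-c", "sleep 3600"]
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: data-rwo  # the RWO claim from the sketch above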

Anything else we need to know?

No response

Kubernetes version

Kubernetes 1.24

Related plugins (CNI, CSI, ...) and versions (if applicable)

CSI

Logs

$ oc -n openshift-storage get pods -l app=csi-cephfsplugin-provisioner
NAME                                            READY   STATUS    RESTARTS   AGE
csi-cephfsplugin-provisioner-86b7679859-dwdm9   4/4     Running   0          135m
csi-cephfsplugin-provisioner-86b7679859-xtpxh   4/4     Running   0          135m

1.b. grep the provisioner logs for the PVC name

$ for POD in csi-cephfsplugin-provisioner-86b7679859-xtpxh csi-cephfsplugin-provisioner-86b7679859-dwdm9 ; do echo "logs for Pod ${POD}" ; oc -n openshift-storage logs -c csi-cephfsplugin ${POD} | grep pvc-test-f308473e504b4469a3b53daaa005578 ; done
logs for Pod csi-cephfsplugin-provisioner-86b7679859-xtpxh
I0809 13:41:15.854858       1 utils.go:199] ID: 14 Req-ID: pvc-3cf8d274-4207-4cb3-8c83-e73133b79748 GRPC request: {"capacity_range":{"required_bytes":10737418240},"name":"pvc-3cf8d274-4207-4cb3-8c83-e73133b79748","parameters":{"clusterID":"openshift-storage","csi.storage.k8s.io/pv/name":"pvc-3cf8d274-4207-4cb3-8c83-e73133b79748","csi.storage.k8s.io/pvc/name":"pvc-test-f308473e504b4469a3b53daaa005578","csi.storage.k8s.io/pvc/namespace":"namespace-test-87e3deb86f094e35bfe4ff03f","fsName":"ocs-storagecluster-cephfilesystem","pool":"ocs-storagecluster-cephfilesystem-data0"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{"mount_flags":["noexec"]}},"access_mode":{"mode":7}}]}
logs for Pod csi-cephfsplugin-provisioner-86b7679859-dwdm9

FOUND: pvc-3cf8d274-4207-4cb3-8c83-e73133b79748


2. find the csi-cephfsplugin Pods that mounted the PV (pvc-3cf8d274-4207-4cb3-8c83-e73133b79748)
2.a. find the csi-cephfsplugin pods

$ oc -n openshift-storage get pods -l app=csi-cephfsplugin
NAME                     READY   STATUS    RESTARTS   AGE
csi-cephfsplugin-68q87   2/2     Running   0          132m
csi-cephfsplugin-ksx9w   2/2     Running   0          132m
csi-cephfsplugin-tw64r   2/2     Running   0          132m


2.b. grep the logs for the PV name

$ for POD in csi-cephfsplugin-68q87 csi-cephfsplugin-ksx9w csi-cephfsplugin-tw64r ; do echo "logs for Pod ${POD}" ; oc -n openshift-storage logs -c csi-cephfsplugin ${POD} | grep pvc-3cf8d274-4207-4cb3-8c83-e73133b79748 ; done
logs for Pod csi-cephfsplugin-68q87
I0809 13:41:17.532502       1 omap.go:88] ID: 6 Req-ID: 0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216 got omap values: (pool="ocs-storagecluster-cephfilesystem-metadata", namespace="csi", name="csi.volume.f0d4d514-17e8-11ed-a476-0a580a810216"): map[csi.imagename:csi-vol-f0d4d514-17e8-11ed-a476-0a580a810216 csi.volname:pvc-3cf8d274-4207-4cb3-8c83-e73133b79748]
I0809 13:41:17.614563       1 utils.go:199] ID: 10 Req-ID: 0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216 GRPC request: {"staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/openshift-storage.cephfs.csi.ceph.com/e7f0a7ded8cb6f07e6ddbe28abc400ebf727be75ef6596481caef9f113e97d16/globalmount","target_path":"/var/lib/kubelet/pods/a04017df-299d-4532-9fd8-14d44d7ef608/volumes/kubernetes.io~csi/pvc-3cf8d274-4207-4cb3-8c83-e73133b79748/mount","volume_capability":{"AccessType":{"Mount":{"mount_flags":["noexec"]}},"access_mode":{"mode":7}},"volume_context":{"clusterID":"openshift-storage","fsName":"ocs-storagecluster-cephfilesystem","pool":"ocs-storagecluster-cephfilesystem-data0","storage.kubernetes.io/csiProvisionerIdentity":"1660050908373-8081-openshift-storage.cephfs.csi.ceph.com","subvolumeName":"csi-vol-f0d4d514-17e8-11ed-a476-0a580a810216","subvolumePath":"/volumes/csi/csi-vol-f0d4d514-17e8-11ed-a476-0a580a810216/8a427289-5306-4b91-896b-7d6299968d28"},"volume_id":"0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216"}
I0809 13:41:17.622569       1 cephcmds.go:105] ID: 10 Req-ID: 0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216 command succeeded: mount [-o bind,_netdev,noexec /var/lib/kubelet/plugins/kubernetes.io/csi/openshift-storage.cephfs.csi.ceph.com/e7f0a7ded8cb6f07e6ddbe28abc400ebf727be75ef6596481caef9f113e97d16/globalmount /var/lib/kubelet/pods/a04017df-299d-4532-9fd8-14d44d7ef608/volumes/kubernetes.io~csi/pvc-3cf8d274-4207-4cb3-8c83-e73133b79748/mount]
I0809 13:41:17.622586       1 nodeserver.go:467] ID: 10 Req-ID: 0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216 cephfs: successfully bind-mounted volume 0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216 to /var/lib/kubelet/pods/a04017df-299d-4532-9fd8-14d44d7ef608/volumes/kubernetes.io~csi/pvc-3cf8d274-4207-4cb3-8c83-e73133b79748/mount
I0809 13:41:56.088231       1 utils.go:199] ID: 12 GRPC request: {"volume_id":"0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216","volume_path":"/var/lib/kubelet/pods/a04017df-299d-4532-9fd8-14d44d7ef608/volumes/kubernetes.io~csi/pvc-3cf8d274-4207-4cb3-8c83-e73133b79748/mount"}
I0809 13:43:02.255201       1 utils.go:199] ID: 14 GRPC request: {"volume_id":"0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216","volume_path":"/var/lib/kubelet/pods/a04017df-299d-4532-9fd8-14d44d7ef608/volumes/kubernetes.io~csi/pvc-3cf8d274-4207-4cb3-8c83-e73133b79748/mount"}
I0809 13:44:59.708129       1 utils.go:199] ID: 16 GRPC request: {"volume_id":"0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216","volume_path":"/var/lib/kubelet/pods/a04017df-299d-4532-9fd8-14d44d7ef608/volumes/kubernetes.io~csi/pvc-3cf8d274-4207-4cb3-8c83-e73133b79748/mount"}
I0809 13:46:03.575349       1 utils.go:199] ID: 18 GRPC request: {"volume_id":"0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216","volume_path":"/var/lib/kubelet/pods/a04017df-299d-4532-9fd8-14d44d7ef608/volumes/kubernetes.io~csi/pvc-3cf8d274-4207-4cb3-8c83-e73133b79748/mount"}
I0809 13:47:04.935550       1 utils.go:199] ID: 20 GRPC request: {"volume_id":"0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216","volume_path":"/var/lib/kubelet/pods/a04017df-299d-4532-9fd8-14d44d7ef608/volumes/kubernetes.io~csi/pvc-3cf8d274-4207-4cb3-8c83-e73133b79748/mount"}
I0809 13:47:28.304383       1 utils.go:199] ID: 21 Req-ID: 0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216 GRPC request: {"target_path":"/var/lib/kubelet/pods/a04017df-299d-4532-9fd8-14d44d7ef608/volumes/kubernetes.io~csi/pvc-3cf8d274-4207-4cb3-8c83-e73133b79748/mount","volume_id":"0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216"}
I0809 13:47:28.316122       1 cephcmds.go:105] ID: 21 Req-ID: 0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216 command succeeded: umount [/var/lib/kubelet/pods/a04017df-299d-4532-9fd8-14d44d7ef608/volumes/kubernetes.io~csi/pvc-3cf8d274-4207-4cb3-8c83-e73133b79748/mount]
I0809 13:47:28.316180       1 nodeserver.go:523] ID: 21 Req-ID: 0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216 cephfs: successfully unbounded volume 0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216 from /var/lib/kubelet/pods/a04017df-299d-4532-9fd8-14d44d7ef608/volumes/kubernetes.io~csi/pvc-3cf8d274-4207-4cb3-8c83-e73133b79748/mount
logs for Pod csi-cephfsplugin-ksx9w
I0809 13:41:23.178221       1 omap.go:88] ID: 44 Req-ID: 0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216 got omap values: (pool="ocs-storagecluster-cephfilesystem-metadata", namespace="csi", name="csi.volume.f0d4d514-17e8-11ed-a476-0a580a810216"): map[csi.imagename:csi-vol-f0d4d514-17e8-11ed-a476-0a580a810216 csi.volname:pvc-3cf8d274-4207-4cb3-8c83-e73133b79748]
I0809 13:41:23.238602       1 utils.go:199] ID: 48 Req-ID: 0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216 GRPC request: {"staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/openshift-storage.cephfs.csi.ceph.com/e7f0a7ded8cb6f07e6ddbe28abc400ebf727be75ef6596481caef9f113e97d16/globalmount","target_path":"/var/lib/kubelet/pods/fc21a287-614c-4e4e-9385-950bbff6a234/volumes/kubernetes.io~csi/pvc-3cf8d274-4207-4cb3-8c83-e73133b79748/mount","volume_capability":{"AccessType":{"Mount":{"mount_flags":["noexec"]}},"access_mode":{"mode":7}},"volume_context":{"clusterID":"openshift-storage","fsName":"ocs-storagecluster-cephfilesystem","pool":"ocs-storagecluster-cephfilesystem-data0","storage.kubernetes.io/csiProvisionerIdentity":"1660050908373-8081-openshift-storage.cephfs.csi.ceph.com","subvolumeName":"csi-vol-f0d4d514-17e8-11ed-a476-0a580a810216","subvolumePath":"/volumes/csi/csi-vol-f0d4d514-17e8-11ed-a476-0a580a810216/8a427289-5306-4b91-896b-7d6299968d28"},"volume_id":"0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216"}
I0809 13:41:23.241279       1 cephcmds.go:105] ID: 48 Req-ID: 0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216 command succeeded: mount [-o bind,_netdev,noexec /var/lib/kubelet/plugins/kubernetes.io/csi/openshift-storage.cephfs.csi.ceph.com/e7f0a7ded8cb6f07e6ddbe28abc400ebf727be75ef6596481caef9f113e97d16/globalmount /var/lib/kubelet/pods/fc21a287-614c-4e4e-9385-950bbff6a234/volumes/kubernetes.io~csi/pvc-3cf8d274-4207-4cb3-8c83-e73133b79748/mount]
I0809 13:41:23.241314       1 nodeserver.go:467] ID: 48 Req-ID: 0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216 cephfs: successfully bind-mounted volume 0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216 to /var/lib/kubelet/pods/fc21a287-614c-4e4e-9385-950bbff6a234/volumes/kubernetes.io~csi/pvc-3cf8d274-4207-4cb3-8c83-e73133b79748/mount
I0809 13:41:52.009793       1 utils.go:199] ID: 50 GRPC request: {"volume_id":"0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216","volume_path":"/var/lib/kubelet/pods/fc21a287-614c-4e4e-9385-950bbff6a234/volumes/kubernetes.io~csi/pvc-3cf8d274-4207-4cb3-8c83-e73133b79748/mount"}
I0809 13:43:13.357503       1 utils.go:199] ID: 54 GRPC request: {"volume_id":"0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216","volume_path":"/var/lib/kubelet/pods/fc21a287-614c-4e4e-9385-950bbff6a234/volumes/kubernetes.io~csi/pvc-3cf8d274-4207-4cb3-8c83-e73133b79748/mount"}
I0809 13:44:29.240535       1 utils.go:199] ID: 58 GRPC request: {"volume_id":"0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216","volume_path":"/var/lib/kubelet/pods/fc21a287-614c-4e4e-9385-950bbff6a234/volumes/kubernetes.io~csi/pvc-3cf8d274-4207-4cb3-8c83-e73133b79748/mount"}
I0809 13:46:28.101368       1 utils.go:199] ID: 62 GRPC request: {"volume_id":"0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216","volume_path":"/var/lib/kubelet/pods/fc21a287-614c-4e4e-9385-950bbff6a234/volumes/kubernetes.io~csi/pvc-3cf8d274-4207-4cb3-8c83-e73133b79748/mount"}
I0809 13:47:30.663240       1 utils.go:199] ID: 65 Req-ID: 0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216 GRPC request: {"target_path":"/var/lib/kubelet/pods/fc21a287-614c-4e4e-9385-950bbff6a234/volumes/kubernetes.io~csi/pvc-3cf8d274-4207-4cb3-8c83-e73133b79748/mount","volume_id":"0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216"}
I0809 13:47:30.684475       1 cephcmds.go:105] ID: 65 Req-ID: 0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216 command succeeded: umount [/var/lib/kubelet/pods/fc21a287-614c-4e4e-9385-950bbff6a234/volumes/kubernetes.io~csi/pvc-3cf8d274-4207-4cb3-8c83-e73133b79748/mount]
I0809 13:47:30.684558       1 nodeserver.go:523] ID: 65 Req-ID: 0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216 cephfs: successfully unbounded volume 0001-0011-openshift-storage-0000000000000001-f0d4d514-17e8-11ed-a476-0a580a810216 from /var/lib/kubelet/pods/fc21a287-614c-4e4e-9385-950bbff6a234/volumes/kubernetes.io~csi/pvc-3cf8d274-4207-4cb3-8c83-e73133b79748/mount
logs for Pod csi-cephfsplugin-tw64r


FOUND: Pods csi-cephfsplugin-68q87 and csi-cephfsplugin-ksx9w both mounted the PV at the same time

humblec avatar Aug 11 '22 10:08 humblec

/sig storage
/triage accepted

humblec avatar Aug 11 '22 10:08 humblec

Cc @chrishenzie @jsafrane @msau42 @xing-yang @saad-ali

humblec avatar Aug 11 '22 10:08 humblec

This has been briefly discussed in the ReadWriteOncePod implementation PR: https://github.com/kubernetes/kubernetes/pull/102028#issuecomment-870091157

jsafrane avatar Aug 11 '22 12:08 jsafrane

But in the absence of the attacher sidecar, RWO backward compatibility appears to be broken: RWO volumes can get mounted on more than one node at the same time when the attacher sidecar is not present.

Not exactly. RWO enforcement in the A/D controller was always only for attachable volumes; it never worked for non-attachable ones. For example, it has always been possible to use an in-tree NFS volume with the RWO access mode on multiple nodes.
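For illustration (a minimal sketch; server address and export path are placeholders), an in-tree NFS PersistentVolume can declare ReadWriteOnce, yet nothing prevents kubelets on several nodes from mounting it, because NFS is not an attachable plugin:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-rwo-example        # placeholder
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce            # documented as single-node, but not enforced for NFS
  nfs:
    server: 192.0.2.10         # placeholder NFS server
    path: /exports/data        # placeholder export path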

jsafrane avatar Aug 11 '22 12:08 jsafrane

cc @chrishenzie

jsafrane avatar Aug 11 '22 12:08 jsafrane

But in the absence of the attacher sidecar, RWO backward compatibility appears to be broken: RWO volumes can get mounted on more than one node at the same time when the attacher sidecar is not present.

Not exactly. RWO enforcement in the A/D controller was always only for attachable volumes; it never worked for non-attachable ones. For example, it has always been possible to use an in-tree NFS volume with the RWO access mode on multiple nodes.

@jsafrane Agreed. However, considering that the attacher sidecar was a hard requirement for CSI drivers before skipAttach existed, this feels like a compatibility change for CSI drivers that recently removed the attacher sidecar by adopting skipAttach (I agree it was an interim path in the CSI driver design). Thanks!

humblec avatar Aug 11 '22 12:08 humblec

From my understanding, we're asking whether we can enforce this for non-attachable volumes? E.g. during volume mount we'd need to detect that the volume is already mounted on another node?

chrishenzie avatar Aug 11 '22 17:08 chrishenzie

From my understanding, we're asking whether we can enforce this for non-attachable volumes? E.g. during volume mount we'd need to detect that the volume is already mounted on another node?

I think we could start with the scheduler. It should put all pods that use a RWO volume onto a single node, or fail if the node is not suitable (affinity, nodeSelector, taints...).

I don't think we need to update kubelet to check other nodes; IMO we do not check other nodes for RWOP volumes either.

jsafrane avatar Aug 15 '22 15:08 jsafrane

/retitle RWO access mode should be enforced for non-attachable volumes

There is no "backward compatibility broken" here; non-attachable RWO volumes have always had the same (bad) behavior.

jsafrane avatar Aug 19 '22 13:08 jsafrane

BTW, this looks like a dup of https://github.com/kubernetes/kubernetes/issues/103305
/close

jsafrane avatar Aug 19 '22 13:08 jsafrane

@jsafrane: Closing this issue.

In response to this:

BTW, this looks like a dup of https://github.com/kubernetes/kubernetes/issues/103305
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Aug 19 '22 13:08 k8s-ci-robot