cloud-provider-openstack
[cinder-csi-plugin] PVC resize not reflecting in pod filesystem
Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug
What happened: I edited a PersistentVolumeClaim to increase its size. The PVC and the underlying PV get updated with the new size, but exec-ing into the pod and checking disk space (`df -h`) still shows the old value. The only way to get the filesystem size updated is to log in to the node and manually run resize2fs on that specific block device.
What you expected to happen: The filesystem size inside the pod should be updated to the new value.
How to reproduce it:
- Create a PVC and mount it in a pod.
- Exec into the pod and check the size.
- Edit the PVC with a new, larger size.
- Exec into the pod and check the size again. It still shows the old value.
Anything else we need to know?: This seems to happen only with the Cinder CSI storage backend in my cluster. A few other storage classes in the same cluster all work fine. I could see many similar issues across GitHub and Red Hat's bug pages, and some of them seem to have had fixes pushed at some point, but the issue is still present.
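For reference, the manual workaround described above (growing the filesystem after the block device has already been enlarged) can be sketched as follows. This is a demo against a scratch ext4 image file, so it needs no root access or real device; on an actual node you would point resize2fs at the block device itself (e.g. /dev/vdg). All paths and sizes here are examples:

```shell
# Demonstrate the manual resize2fs workaround on a scratch ext4 image.
truncate -s 64M /tmp/pvc-demo.img
mkfs.ext4 -q -F /tmp/pvc-demo.img      # small filesystem, like the original PVC
truncate -s 128M /tmp/pvc-demo.img     # grow the "device", like a Cinder expand
e2fsck -f -p /tmp/pvc-demo.img         # resize2fs wants a clean filesystem first
resize2fs /tmp/pvc-demo.img            # grow the filesystem to fill the device
```

The same last two steps are what has to happen on the node when the CSI driver's NodeExpandVolume does not do it automatically.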
Environment:
- openstack-cloud-controller-manager version: 1.25.3
- OpenStack version: 3.18.0
- cinder-csi-plugin version: 1.25.3
- cloud_provider_openstack_version: 1.25.3
- kubernetes version: 1.24.2
this has been reported multiple times, maybe the following can be helpful to you: https://github.com/kubernetes/cloud-provider-openstack/issues/2059
Hi @jichenjc
I encountered a similar issue. When I edited a PersistentVolumeClaim to increase its size, the PVC and underlying PV got updated with the new size, but on the node the resize failed with an error. Here is the log:
On the node:
I0301 06:31:59.236703 1 nodeserver.go:541] NodeExpandVolume: called with args {"capacity_range":{"required_bytes":51539607552},"staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pv-b2f39bc2-65ac-48bc-8d64-b75e4f826536/globalmount","volume_capability":{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}},"volume_id":"6ca713bc-7b3f-4485-8810-3deaacafbbd8","volume_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pv-b2f39bc2-65ac-48bc-8d64-b75e4f826536/globalmount"}
E0301 06:31:59.308265 1 utils.go:92] [ID:902218] GRPC error: rpc error: code = Internal desc = Failed to find mount file system /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pv-b2f39bc2-65ac-48bc-8d64-b75e4f826536/globalmount: executable file not found in $PATH
The volume is resized in OpenStack:

```
:~$ openstack volume list | grep 6ca713bc-7b3f-4485-8810-3deaacafbbd8
| 6ca713bc-7b3f-4485-8810-3deaacafbbd8 | pv-b2f39bc2-65ac-48bc-8d64-b75e4f826536 | in-use | 56 | Attached to worker-pool1 on /dev/vdg
```
error: rpc error: code = Internal desc = Failed to find mount file system /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pv-b2f39bc2-65ac-48bc-8d64-b75e4f826536/globalmount: executable file not found in $PATH
didn't encounter this before..
executable file not found in $PATH seems to be telling us something is missing, maybe you can check what's not found?
yes, but I don't know what file it is going to execute. I have already installed resize2fs and mkfs on the node; what other commands does it need to execute during the resizing process?
the failing function is:
```go
func (m *Mount) GetMountFs(volumePath string) ([]byte, error) {
	args := []string{"-o", "source", "--first-only", "--noheadings", "--target", volumePath}
	return m.BaseMounter.Exec.Command("findmnt", args...).CombinedOutput()
}
```
maybe you can check from here.. or add some logs?
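One thing worth noting: since GetMountFs shells out to findmnt, the binary has to be available on the $PATH inside the cinder-csi-plugin container itself; having it installed on the node is not enough. The exact invocation the function builds can be reproduced by hand to see what it returns (the target path below is just an example, not the PV's globalmount path):

```shell
# Same flags GetMountFs passes to findmnt; prints the source device backing a mount.
findmnt -o source --first-only --noheadings --target /
```

To check inside the plugin container, the same command can be run via kubectl exec against the nodeplugin pod.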
The looping log output:
I0301 06:31:58.221974 1 nodeserver.go:497] NodeGetVolumeStats: called with args {"volume_id":"4539b22a-726e-42dc-8d4a-f4f4b9e2c325","volume_path":"/var/lib/kubelet/pods/dbb4b4cd-1c2a-476f-9502-4a1cd6bc3bd2/volumes/kubernetes.io~csi/pv-d7c1f72a-b90f-4242-9ba3-0f74a470cc84/mount"}
I0301 06:31:58.284863 1 utils.go:88] [ID:902211] GRPC call: /csi.v1.Node/NodeGetCapabilities
I0301 06:31:58.305252 1 utils.go:88] [ID:902212] GRPC call: /csi.v1.Node/NodeGetCapabilities
I0301 06:31:58.306339 1 utils.go:88] [ID:902213] GRPC call: /csi.v1.Node/NodeGetCapabilities
I0301 06:31:58.315955 1 utils.go:88] [ID:902214] GRPC call: /csi.v1.Node/NodeStageVolume
I0301 06:31:58.318144 1 nodeserver.go:352] NodeStageVolume: called with args {"publish_context":{"DevicePath":"/dev/vdv"},"staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pv-b2f39bc2-65ac-48bc-8d64-b75e4f826536/globalmount","volume_capability":{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}},"volume_context":{"storage.kubernetes.io/csiProvisionerIdentity":"1670494123279-8081-cinder.csi.openstack.org"},"volume_id":"6ca713bc-7b3f-4485-8810-3deaacafbbd8"}
I0301 06:31:59.157976 1 mount.go:172] Found disk attached as "virtio-6ca713bc-7b3f-4485-8"; full devicepath: /dev/disk/by-id/virtio-6ca713bc-7b3f-4485-8
I0301 06:31:59.162171 1 mount_linux.go:487] Attempting to determine if disk "/dev/disk/by-id/virtio-6ca713bc-7b3f-4485-8" is formatted using blkid with args: ([-p -s TYPE -s PTTYPE -o export /dev/disk/by-id/virtio-6ca713bc-7b3f-4485-8])
I0301 06:31:59.179612 1 mount_linux.go:490] Output: "DEVNAME=/dev/disk/by-id/virtio-6ca713bc-7b3f-4485-8\nTYPE=ext4\n"
I0301 06:31:59.179634 1 mount_linux.go:376] Checking for issues with fsck on disk: /dev/disk/by-id/virtio-6ca713bc-7b3f-4485-8
I0301 06:31:59.209594 1 mount_linux.go:477] Attempting to mount disk /dev/disk/by-id/virtio-6ca713bc-7b3f-4485-8 in ext4 format at /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pv-b2f39bc2-65ac-48bc-8d64-b75e4f826536/globalmount
I0301 06:31:59.209644 1 mount_linux.go:183] Mounting cmd (mount) with arguments (-t ext4 -o defaults /dev/disk/by-id/virtio-6ca713bc-7b3f-4485-8 /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pv-b2f39bc2-65ac-48bc-8d64-b75e4f826536/globalmount)
I0301 06:31:59.223029 1 utils.go:88] [ID:902215] GRPC call: /csi.v1.Node/NodeGetCapabilities
I0301 06:31:59.223949 1 utils.go:88] [ID:902216] GRPC call: /csi.v1.Node/NodeGetCapabilities
I0301 06:31:59.224606 1 utils.go:88] [ID:902217] GRPC call: /csi.v1.Node/NodeGetCapabilities
I0301 06:31:59.236676 1 utils.go:88] [ID:902218] GRPC call: /csi.v1.Node/NodeExpandVolume
I0301 06:31:59.236703 1 nodeserver.go:541] NodeExpandVolume: called with args {"capacity_range":{"required_bytes":51539607552},"staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pv-b2f39bc2-65ac-48bc-8d64-b75e4f826536/globalmount","volume_capability":{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}},"volume_id":"6ca713bc-7b3f-4485-8810-3deaacafbbd8","volume_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pv-b2f39bc2-65ac-48bc-8d64-b75e4f826536/globalmount"}
E0301 06:31:59.308265 1 utils.go:92] [ID:902218] GRPC error: rpc error: code = Internal desc = Failed to find mount file system /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pv-b2f39bc2-65ac-48bc-8d64-b75e4f826536/globalmount: executable file not found in $PATH
I0301 06:31:59.900837 1 utils.go:88] [ID:902219] GRPC call: /csi.v1.Node/NodeGetCapabilities
I0301 06:31:59.925454 1 utils.go:88] [ID:902220] GRPC call: /csi.v1.Node/NodeGetCapabilities
I0301 06:31:59.926046 1 utils.go:88] [ID:902221] GRPC call: /csi.v1.Node/NodeGetCapabilities
I0301 06:31:59.926603 1 utils.go:88] [ID:902222] GRPC call: /csi.v1.Node/NodeStageVolume
I resized the PVC from 32G to 56G:
~ # df -h |grep pv-b2f39bc2-65ac-48bc-8d64-b75e4f826536
/dev/vdg 32G 32G 0 100% /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pv-b2f39bc2-65ac-48bc-8d64-b75e4f826536/globalmount
:~ # fdisk /dev/vdg
Welcome to fdisk (util-linux 2.36.2).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.
The device contains 'ext4' signature and it will be removed by a write command. See fdisk(8) man page and --wipe option for more details.
Device does not contain a recognized partition table.
Created a new DOS disklabel with disk identifier 0x45ae7994.
Command (m for help): p
Disk /dev/vdg: 56 GiB, 60129542144 bytes, 117440512 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x45ae7994
Command (m for help): F
Unpartitioned space /dev/vdg: 56 GiB, 60128493568 bytes, 117438464 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
Start End Sectors Size
2048 117440511 117438464 56G
this has been reported multiple times, maybe following can be helpful to you #2059
Hi @jichenjc I followed that troubleshooting guide and also looked at other similar issues, which hinted that it could be an OpenStack environment problem, and tried going through the OpenStack logs but couldn't find any related error messages.
I then went through the cinder csi related container logs and found one specific line
I0301 10:32:15.767556 1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"kube-system", Name:"lcm-container-registry", UID:"a7effd8f-3b4f-3d86-8c5e-bf8a9bde359d", APIVersion:"v1", ResourceVersion:"7493038", FieldPath:""}): type: 'Normal' reason: 'FileSystemResizeRequired' Require file system resize of volume on node
Does this mean that we need to manually resize this every time? Although, I couldn't see this as an error in the logs and events of the pod to which this PVC is tied.
I tracked a few more related logs from around the same time across csi cinder related pods:
$ kubectl logs -n kube-system csi-cinder-controllerplugin-7b66gg465d-kl8n4 csi-resizer
I0301 10:32:14.282683 1 controller.go:291] Started PVC processing "kube-system/lcm-container-registry"
I0301 10:32:14.317253 1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"kube-system", Name:"lcm-container-registry", UID:"a7effd8f-3b4f-3d86-8c5e-bf8a9bde359d", APIVersion:"v1", ResourceVersion:"7493038", FieldPath:""}): type: 'Normal' reason: 'Resizing' External resizer is resizing volume k8s-stack-a7effd8f-3b4f-3d86-8c5e-bf8a9bde359d
I0301 10:32:15.751642 1 controller.go:468] Resize volume succeeded for volume "k8s-stack-a7effd8f-3b4f-3d86-8c5e-bf8a9bde359d", start to update PV's capacity
I0301 10:32:15.751674 1 controller.go:570] Resize volume succeeded for volume "k8s-stack-a7effd8f-3b4f-3d86-8c5e-bf8a9bde359d", start to update PV's capacity
I0301 10:32:15.761160 1 controller.go:474] Update capacity of PV "k8s-stack-a7effd8f-3b4f-3d86-8c5e-bf8a9bde359d" to 24Gi succeeded
I0301 10:32:15.767483 1 controller.go:496] Mark PVC "kube-system/lcm-container-registry" as file system resize required
I0301 10:32:15.767527 1 controller.go:291] Started PVC processing "kube-system/lcm-container-registry"
I0301 10:32:15.767539 1 controller.go:338] No need to resize PV "k8s-stack-a7effd8f-3b4f-3d86-8c5e-bf8a9bde359d"
I0301 10:32:15.767556 1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"kube-system", Name:"lcm-container-registry", UID:"a7effd8f-3b4f-3d86-8c5e-bf8a9bde359d", APIVersion:"v1", ResourceVersion:"7493038", FieldPath:""}): type: 'Normal' reason: 'FileSystemResizeRequired' Require file system resize of volume on node
$ kubectl logs -n kube-system csi-cinder-controllerplugin-7b66gg465d-kl8n4 cinder-csi-plugin
I0301 10:32:12.620718 1 utils.go:88] [ID:51588] GRPC call: /csi.v1.Controller/ListVolumes
I0301 10:32:14.328805 1 utils.go:88] [ID:51589] GRPC call: /csi.v1.Controller/ControllerGetCapabilities
I0301 10:32:14.331517 1 utils.go:88] [ID:51590] GRPC call: /csi.v1.Controller/ControllerExpandVolume
I0301 10:32:14.337582 1 controllerserver.go:595] ControllerExpandVolume: called with args {"capacity_range":{"required_bytes":25769803776},"volume_capability":{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}},"volume_id":"22d86e37-21d1-3125-1234-2b4e6f92e2d9"}
I0301 10:32:15.751369 1 controllerserver.go:644] ControllerExpandVolume resized volume 22d86e37-21d1-3125-1234-2b4e6f92e2d9 to size 24
$ kubectl logs csi-cinder-nodeplugin-mbwp4 -n kube-system cinder-csi-plugin
I0301 11:41:57.748435 1 utils.go:88] [ID:152499] GRPC call: /csi.v1.Node/NodeStageVolume
I0301 11:41:57.748459 1 nodeserver.go:352] NodeStageVolume: called with args {"publish_context":{"DevicePath":"/dev/vdb"},"staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/k8s-stack-a7effd8f-3b4f-3d86-8c5e-bf8a9bde359d/globalmount","volume_capability":{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}},"volume_context":{"storage.kubernetes.io/csiProvisionerIdentity":"1676108077029-8081-cinder.csi.openstack.org"},"volume_id":"22d86e37-21d1-3125-1234-2b4e6f92e2d9"}
I0301 11:41:58.518221 1 mount.go:172] Found disk attached as "virtio-22d86e37-21d1-4589-8"; full devicepath: /dev/disk/by-id/virtio-22d86e37-21d1-4589-8
I0301 11:41:58.518279 1 mount_linux.go:487] Attempting to determine if disk "/dev/disk/by-id/virtio-22d86e37-21d1-4589-8" is formatted using blkid with args: ([-p -s TYPE -s PTTYPE -o export /dev/disk/by-id/virtio-22d86e37-21d1-4589-8])
I0301 11:41:58.534191 1 mount_linux.go:490] Output: "DEVNAME=/dev/disk/by-id/virtio-22d86e37-21d1-4589-8\nTYPE=ext4\n"
I0301 11:41:58.534215 1 mount_linux.go:376] Checking for issues with fsck on disk: /dev/disk/by-id/virtio-22d86e37-21d1-4589-8
I0301 11:41:58.562693 1 mount_linux.go:477] Attempting to mount disk /dev/disk/by-id/virtio-22d86e37-21d1-4589-8 in ext4 format at /var/lib/kubelet/plugins/kubernetes.io/csi/pv/k8s-stack-a7effd8f-3b4f-3d86-8c5e-bf8a9bde359d/globalmount
I0301 11:41:58.562738 1 mount_linux.go:183] Mounting cmd (mount) with arguments (-t ext4 -o defaults /dev/disk/by-id/virtio-22d86e37-21d1-4589-8 /var/lib/kubelet/plugins/kubernetes.io/csi/pv/k8s-stack-a7effd8f-3b4f-3d86-8c5e-bf8a9bde359d/globalmount)
I0301 11:41:58.576616 1 utils.go:88] [ID:152500] GRPC call: /csi.v1.Node/NodeGetCapabilities
I0301 11:41:58.579632 1 utils.go:88] [ID:152501] GRPC call: /csi.v1.Node/NodeGetCapabilities
I0301 11:41:58.580290 1 utils.go:88] [ID:152502] GRPC call: /csi.v1.Node/NodeGetCapabilities
I0301 11:41:58.581136 1 utils.go:88] [ID:152503] GRPC call: /csi.v1.Node/NodePublishVolume
I0301 11:41:58.581150 1 nodeserver.go:51] NodePublishVolume: called with args {"publish_context":{"DevicePath":"/dev/vdb"},"staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/k8s-stack-a7effd8f-3b4f-3d86-8c5e-bf8a9bde359d/globalmount","target_path":"/var/lib/kubelet/pods/b04d239f-b1c5-4ac7-8c3b-04ce8257f5f5/volumes/kubernetes.io~csi/k8s-stack-a7effd8f-3b4f-3d86-8c5e-bf8a9bde359d/mount","volume_capability":{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}},"volume_context":{"csi.storage.k8s.io/ephemeral":"false","csi.storage.k8s.io/pod.name":"lcm-container-registry-registry-6d4f6c46bd-4hrpw","csi.storage.k8s.io/pod.namespace":"kube-system","csi.storage.k8s.io/pod.uid":"b04d239f-b1c5-4ac7-8c3b-04ce8257f5f5","csi.storage.k8s.io/serviceAccount.name":"lcm-container-registry-registry","storage.kubernetes.io/csiProvisionerIdentity":"1676108077029-8081-cinder.csi.openstack.org"},"volume_id":"22d86e37-21d1-3125-1234-2b4e6f92e2d9"}
I0301 11:41:58.656960 1 mount_linux.go:183] Mounting cmd (mount) with arguments (-t ext4 -o bind /var/lib/kubelet/plugins/kubernetes.io/csi/pv/k8s-stack-a7effd8f-3b4f-3d86-8c5e-bf8a9bde359d/globalmount /var/lib/kubelet/pods/b04d239f-b1c5-4ac7-8c3b-04ce8257f5f5/volumes/kubernetes.io~csi/k8s-stack-a7effd8f-3b4f-3d86-8c5e-bf8a9bde359d/mount)
I0301 11:41:58.658532 1 mount_linux.go:183] Mounting cmd (mount) with arguments (-t ext4 -o bind,remount,rw /var/lib/kubelet/plugins/kubernetes.io/csi/pv/k8s-stack-a7effd8f-3b4f-3d86-8c5e-bf8a9bde359d/globalmount /var/lib/kubelet/pods/b04d239f-b1c5-4ac7-8c3b-04ce8257f5f5/volumes/kubernetes.io~csi/k8s-stack-a7effd8f-3b4f-3d86-8c5e-bf8a9bde359d/mount)
@kpauljoseph can you share your PVC storage class details? e.g.
$ kubectl get sc `kubectl get pvc my-pvc -o json | jq -r '.spec.storageClassName'` -o yaml
should contain allowVolumeExpansion: true
@kayrus it's enabled
$ kubectl get storageclasses.storage.k8s.io
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
network-block (default) cinder.csi.openstack.org Delete Immediate true 21d
$ kubectl describe storageclasses.storage.k8s.io network-block
Name: network-block
IsDefaultClass: Yes
Annotations: storageclass.kubernetes.io/is-default-class=true
Provisioner: cinder.csi.openstack.org
Parameters: availability=nova,csi.storage.k8s.io/fstype=ext4
AllowVolumeExpansion: True
MountOptions: <none>
ReclaimPolicy: Delete
VolumeBindingMode: Immediate
Events: <none>
$ kubectl get sc `kubectl get pvc lcm-container-registry -n kube-system -o json | jq -r '.spec.storageClassName'` -o yaml
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
annotations:
storageclass.kubernetes.io/is-default-class: "true"
creationTimestamp: "2023-02-09T12:35:35Z"
name: network-block
resourceVersion: "1470"
selfLink: /apis/storage.k8s.io/v1/storageclasses/network-block
uid: b31d8747-cce8-1234-5678-9d1b647b96c3
parameters:
availability: nova
csi.storage.k8s.io/fstype: ext4
provisioner: cinder.csi.openstack.org
reclaimPolicy: Delete
volumeBindingMode: Immediate
@kpauljoseph the log should report
NodeExpandVolume: called with args xxx like @Rico556's log shows, but it seems your log doesn't have this.
I suspect the resizer didn't call CPO, given this line in your log:
I0301 10:32:15.767539 1 controller.go:338] No need to resize PV "k8s-stack-a7effd8f-3b4f-3d86-8c5e-bf8a9bde359d"
Though I'm not sure what happened behind the scenes, technically a manual resize should not be needed, so I suspect something is wrong in @Rico556's env; that's the reason I think maybe findmnt is not installed:
Failed to find mount file system /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pv-b2f39bc2-65ac-48bc-8d64-b75e4f826536/globalmount: executable file not found in $PATH
hi @jichenjc I checked and findmnt is already installed.
:~ # findmnt --help
Usage:
findmnt [options]
findmnt [options] <device> | <mountpoint>
findmnt [options] <device> <mountpoint>
findmnt [options] [--source <device>] [--target <path> | --mountpoint <dir>]
Find a (mounted) filesystem.
hm. I wonder whether this test is somehow related to this issue: https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/cloud-provider-openstack/2140/openstack-cloud-csi-cinder-e2e-test-release-125/1631575594753855488
while waiting for fs resize to finish: error waiting for pvc "inline-volume-tester-2vcb9-my-volume-0" filesystem resize to finish: timed out waiting for the condition
um.. this wasn't reported in the above logs, will dig into this a little bit..
Hi @jichenjc, any update on this issue? Any workaround would you suggest?
sorry, not yet... too busy with other stuff recently; if anyone has any insight that would be helpful @kayrus @zetaab
@jichenjc Wondering any update on this issue?
#2059 might be helpful, especially the last 2 comments from @seanschneeweiss
Hi @jichenjc. I see that some merges are done for the ticket you mentioned above (https://bugs.launchpad.net/charm-cinder/+bug/1939389). Could that fix the issue, and may I know when the fix will be available through a release version?
Hi @jichenjc ,
As suggested by the ticket https://bugs.launchpad.net/charm-cinder/+bug/1939389, we added the missing [nova] section (the one that refers to authentication) to the cinder conf, but it's still not working.
Example [nova] section from the cinder conf:

```
[nova]
interface = internal
auth_url = XXXX
auth_type = password
project_domain_id = default
user_domain_id = default
region_name = XXXX
project_name = service
username = nova
password = XXXX
cafile =
```
So, let's summarize what we have. You trigger a PVC resize, OpenStack API shows that the volume size has been increased, but the pod with the corresponding PVC doesn't show the expected capacity, right?
- Do logs contain strings starting with NodeExpandVolume: called with args ...?
- Have you tried to trigger resize2fs /dev... directly from the pod (your pod should be privileged)?
- What does lsblk say?
- What kind of hypervisor do you use for VMs?
Answers:
So, let's summarize what we have. You trigger a PVC resize, OpenStack API shows that the volume size has been increased, but the pod with the corresponding PVC doesn't show the expected capacity, right? --> Here is what we did: we edited the PVC with a new, larger size, from 32Gi to 48Gi (kubectl get pvc -n kube-system eric-lcm-container-registry). After this operation we found the PVC has been extended to 48Gi, but the filesystem size is still 32Gi in the pod.
2-24-0-rel:~> kubectl get pvc -n kube-system eric-lcm-container-registry -oyaml | grep storage
volume.beta.kubernetes.io/storage-provisioner: cinder.csi.openstack.org
volume.kubernetes.io/storage-provisioner: cinder.csi.openstack.org
storage: 48G
storageClassName: network-block
storage: 46875000Ki
2-24-0-rel: kubectl exec -it -n kube-system eric-lcm-container-registry-registry-c98794fdf-dzgxg sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
Defaulted container "registry" out of: registry, nginx-tls-terminator, sidecar
sh-4.4$ df -h
Filesystem  Size  Used Avail Use% Mounted on
overlay      24G   11G   14G  43% /
tmpfs        64M     0   64M   0% /dev
tmpfs       2.0G     0  2.0G   0% /sys/fs/cgroup
/dev/vda3    24G   11G   14G  43% /etc/hosts
shm          64M     0   64M   0% /dev/shm
tmpfs       3.8G  4.0K  3.8G   1% /etc/docker/registry
/dev/vdb     32G   28K   32G   1% /var/lib/registry
tmpfs       2.0G     0  2.0G   0% /proc/acpi
tmpfs       2.0G     0  2.0G   0% /proc/scsi
tmpfs       2.0G     0  2.0G   0% /sys/firmware
sh-4.4$
After that I logged into the node and found:
worker-2-24-0-rel: lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
sr0     11:0    1  678K  0 rom
vda    252:0    0   24G  0 disk
├─vda1 252:1    0    2M  0 part
├─vda2 252:2    0   33M  0 part /boot/efi
└─vda3 252:3    0   24G  0 part /var/lib/kubelet/pods/adf25f71-50b8-49b4-bf53-d84217c0bf44/volume-subpaths/sidecar-config/sidecar/0
                                /var/lib/kubelet/pods/adf25f71-50b8-49b4-bf53-d84217c0bf44/volume-subpaths/registry-config/registry/1
                                /opt/cni
                                /
vdb    252:16   0   32G  0 disk /var/lib/kubelet/pods/adf25f71-50b8-49b4-bf53-d84217c0bf44/volume-subpaths/ezghodh-2-24-0-rel-8a69de50-8823-4ede-8cc6-1a412f07bfee/registry/0
                                /var/lib/kubelet/pods/adf25f71-50b8-49b4-bf53-d84217c0bf44/volumes/kubernetes.io~csi/ezghodh-2-24-0-rel-8a69de50-8823-4ede-8cc6-1a412f07bfee/mount
                                /var/lib/kubelet/plugins/kubernetes.io/csi/cinder.csi.openstack.org/64bc60a9af910b155eb6efbcd8f07437ecf3a1b1103e42107c8a72ad302bfc7e/globalmount
Do logs contain strings starting with NodeExpandVolume: called with args ...? --> Yes, found in the node logs as below.
worker-2-24-0-rel: journalctl | grep NodeExpandVolume
Jun 22 06:01:52 worker-pool1-u7zw0i3p-ezghodh-2-24-0-rel kubelet[4458]: I0622 06:01:52.621220 4458 operation_generator.go:2217] "MountVolume.NodeExpandVolume succeeded for volume "ezghodh-2-24-0-rel-8a69de50-8823-4ede-8cc6-1a412f07bfee" (UniqueName: "kubernetes.io/csi/cinder.csi.openstack.org^4b366cb8-5503-4769-80f2-0506cb7ad5f5") pod "eric-lcm-container-registry-registry-c98794fdf-dzgxg" (UID: "adf25f71-50b8-49b4-bf53-d84217c0bf44") worker-2-24-0-rel" pod="kube-system/eric-lcm-container-registry-registry-c98794fdf-dzgxg"
worker-2-24-0-rel:
What does lsblk say? --> Listed above
What kind of hypervisor do you use for VMs? -->
worker-2-24-0-rel: sudo dmidecode | grep -i -e manufacturer -e product -e vendor
Vendor: SeaBIOS
Manufacturer: OpenStack Foundation
Product Name: OpenStack Nova
Manufacturer: QEMU
Manufacturer: QEMU
Manufacturer: QEMU
Manufacturer: QEMU
Manufacturer: QEMU
Manufacturer: QEMU
@dhiman360 can you also share the CSI driver version you're using? Is it still 1.25.3? It would also be nice to have at least --v=5 cinder CSI nodeserver logs containing the NodeExpandVolume string.
Please share the output of the kubectl get sc network-block -o yaml.
In addition, can you run these two commands manually on the host node as root and see whether lsblk shows the correct size?
udevadm trigger
for i in /sys/class/scsi_host/*/scan; do echo '- - -' > $i; done
P.S. Please use markdown formatting to highlight the logs/cli output.
Hi @kayrus Don't see any changes:
2-24-0-rel: kubectl get sc network-block -o yaml
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
annotations:
storageclass.kubernetes.io/is-default-class: "true"
creationTimestamp: "2023-03-29T16:43:59Z"
name: network-block
resourceVersion: "2536"
uid: 9d5a23a7-863e-4e63-99a2-a953fa876cfe
parameters:
availability: nova
csi.storage.k8s.io/fstype: ext4
provisioner: cinder.csi.openstack.org
reclaimPolicy: Delete
volumeBindingMode: Immediate
worker-2-24-0-rel:/home/eccd # udevadm trigger
worker-2-24-0-rel:/home/eccd # for i in /sys/class/scsi_host/*/scan; do echo '- - -' > $i; done
worker-2-24-0-rel:/home/eccd # lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sr0 11:0 1 678K 0 rom
vda 252:0 0 24G 0 disk
├─vda1
│ 252:1 0 2M 0 part
├─vda2
│ 252:2 0 33M 0 part /boot/efi
└─vda3
252:3 0 24G 0 part /var/lib/kubelet/pods/adf25f71-50b8-49b4-bf53-d84217c0bf44/volume-subpaths/sidecar-config/sidecar/0
/var/lib/kubelet/pods/adf25f71-50b8-49b4-bf53-d84217c0bf44/volume-subpaths/registry-config/registry/1
/opt/cni
/
vdb 252:16 0 32G 0 disk /var/lib/kubelet/pods/adf25f71-50b8-49b4-bf53-d84217c0bf44/volume-subpaths/ezghodh-2-24-0-rel-8a69de50-8823-4ede-8cc6-1a412f07bfee/registry/0
/var/lib/kubelet/pods/adf25f71-50b8-49b4-bf53-d84217c0bf44/volumes/kubernetes.io~csi/ezghodh-2-24-0-rel-8a69de50-8823-4ede-8cc6-1a412f07bfee/mount
/var/lib/kubelet/plugins/kubernetes.io/csi/cinder.csi.openstack.org/64bc60a9af910b155eb6efbcd8f07437ecf3a1b1103e42107c8a72ad302bfc7e/globalmount
@dhiman360 I'm afraid your OpenStack provider doesn't support online volume expansion. Did you have a chance to clarify this question with your OpenStack cloud admins?
@kayrus can you please provide specific data that I can share with the admin, showing that it doesn't support online volume expansion? Thanks.
I think the https://github.com/kubernetes/cloud-provider-openstack/issues/2138#issuecomment-1602230637 comment would be enough, and the output of openstack volume show %% with the desired volume size.
@kayrus, I have done a set of tests with PVC increments. Please have a look. It seems that if we increase by 8Gi every time it works, but if the numbers are not multiples of 8Gi then we have issues; the behaviors are different. Also, a big jump from 8Gi to 40Gi is not reflected either, even though it is a multiple of 8Gi.
Is it something to do with the following property (sio_round_volume_capacity ) listed? https://docs.openstack.org/cinder/rocky/configuration/block-storage/samples/cinder.conf.html
Round volume sizes up to 8GB boundaries. VxFlex OS/ScaleIO requires volumes to be sized in multiples of 8GB. If set to False, volume creation will fail for volumes not sized properly (boolean value) sio_round_volume_capacity = true
@kayrus In this execution, although the journalctl log says "MountVolume.NodeExpandVolume succeeded", it never actually succeeded; check the lsblk from the node at the end, which also did not change. Details of the test execution:
Original:
=====
master kubectl get pvc -A | grep registry
kube-system ee-container-registry Bound dddd-2-26-0-rc3-7190afcf-543c-4a4a-a508-4f19b0ca2006 10Gi RWO network-block 10h
master kubectl exec -it -n kube-system ee-container-registry-registry-7787b97786-7p5zd -- df -h
Defaulted container "registry" out of: registry, nginx-tls-terminator, sidecar
Filesystem Size Used Avail Use% Mounted on
overlay 24G 9.5G 15G 40% /
tmpfs 64M 0 64M 0% /dev
tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
/dev/vda3 24G 9.5G 15G 40% /etc/hosts
shm 64M 0 64M 0% /dev/shm
/dev/vdb 16G 28K 16G 1% /var/lib/registry
tmpfs 5.0G 4.0K 5.0G 1% /etc/docker/registry
tmpfs 3.9G 0 3.9G 0% /proc/acpi
tmpfs 3.9G 0 3.9G 0% /proc/scsi
tmpfs 3.9G 0 3.9G 0% /sys/firmware
TRY-1: 10Gi to 24Gi -> FAILED
===================
master kubectl edit pvc -n kube-system ee-container-registry
persistentvolumeclaim/ee-container-registry edited
master kubectl get pvc -n kube-system ee-container-registry
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
ee-container-registry Bound dddd-2-26-0-rc3-7190afcf-543c-4a4a-a508-4f19b0ca2006 24Gi RWO network-block 10h
master kubectl exec -it -n kube-system ee-container-registry-registry-7787b97786-7p5zd -- df -h
Defaulted container "registry" out of: registry, nginx-tls-terminator, sidecar
Filesystem Size Used Avail Use% Mounted on
overlay 24G 9.5G 15G 40% /
tmpfs 64M 0 64M 0% /dev
tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
/dev/vda3 24G 9.5G 15G 40% /etc/hosts
shm 64M 0 64M 0% /dev/shm
/dev/vdb 16G 28K 16G 1% /var/lib/registry <<----------- Unchanged
tmpfs 5.0G 4.0K 5.0G 1% /etc/docker/registry
tmpfs 3.9G 0 3.9G 0% /proc/acpi
tmpfs 3.9G 0 3.9G 0% /proc/scsi
tmpfs 3.9G 0 3.9G 0% /sys/firmware
TRY-2: 24Gi to 30Gi -> FAILED
===================
master kubectl edit pvc -n kube-system ee-container-registry
persistentvolumeclaim/ee-container-registry edited
master kubectl get pvc -n kube-system ee-container-registry
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
ee-container-registry Bound dddd-2-26-0-rc3-7190afcf-543c-4a4a-a508-4f19b0ca2006 30Gi RWO network-block 10h
master kubectl exec -it -n kube-system ee-container-registry-registry-7787b97786-7p5zd -- df -h
Defaulted container "registry" out of: registry, nginx-tls-terminator, sidecar
Filesystem Size Used Avail Use% Mounted on
overlay 24G 9.5G 15G 40% /
tmpfs 64M 0 64M 0% /dev
tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
/dev/vda3 24G 9.5G 15G 40% /etc/hosts
shm 64M 0 64M 0% /dev/shm
/dev/vdb 16G 28K 16G 1% /var/lib/registry <<----------- Unchanged
tmpfs 5.0G 4.0K 5.0G 1% /etc/docker/registry
tmpfs 3.9G 0 3.9G 0% /proc/acpi
tmpfs 3.9G 0 3.9G 0% /proc/scsi
tmpfs 3.9G 0 3.9G 0% /sys/firmware
TRY-3: 30Gi to 38Gi -> FAILED
===================
master kubectl edit pvc -n kube-system ee-container-registry
persistentvolumeclaim/ee-container-registry edited
master kubectl get pvc -n kube-system ee-container-registry
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
ee-container-registry Bound dddd-2-26-0-rc3-7190afcf-543c-4a4a-a508-4f19b0ca2006 38Gi RWO network-block 10h
master kubectl exec -it -n kube-system ee-container-registry-registry-7787b97786-7p5zd -- df -h
Defaulted container "registry" out of: registry, nginx-tls-terminator, sidecar
Filesystem Size Used Avail Use% Mounted on
overlay 24G 9.5G 15G 40% /
tmpfs 64M 0 64M 0% /dev
tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
/dev/vda3 24G 9.5G 15G 40% /etc/hosts
shm 64M 0 64M 0% /dev/shm
/dev/vdb 16G 28K 16G 1% /var/lib/registry <<----------- Unchanged
tmpfs 5.0G 4.0K 5.0G 1% /etc/docker/registry
tmpfs 3.9G 0 3.9G 0% /proc/acpi
tmpfs 3.9G 0 3.9G 0% /proc/scsi
tmpfs 3.9G 0 3.9G 0% /sys/firmware
lsblk at the end.
=================
master ssh 10.0.16.4 lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sr0 11:0 1 678K 0 rom
vda 252:0 0 24G 0 disk
├─vda1 252:1 0 2M 0 part
├─vda2 252:2 0 33M 0 part /boot/efi
└─vda3 252:3 0 24G 0 part /var/lib/kubelet/pods/81c25228-6130-457d-b60c-2336cbca3449/volume-subpaths/sidecar-config/sidecar/0
/var/lib/kubelet/pods/81c25228-6130-457d-b60c-2336cbca3449/volume-subpaths/registry-config/registry/1
/opt/cni
/
vdb 252:16 0 16G 0 disk /var/lib/kubelet/pods/81c25228-6130-457d-b60c-2336cbca3449/volume-subpaths/dddd-2-26-0-rc3-7190afcf-543c-4a4a-a508-4f19b0ca2006/registry/0 <<----------- Unchanged
. /var/lib/kubelet/pods/81c25228-6130-457d-b60c-2336cbca3449/volumes/kubernetes.io~csi/dddd-2-26-0-rc3-7190afcf-543c-4a4a-a508-4f19b0ca2006/mount
/var/lib/kubelet/plugins/kubernetes.io/csi/cinder.csi.openstack.org/51071c6dbeff83bda6423106eba6d23bcfc8d3657562f46b4db1e0e0b92640e4/globalmount
vdc 252:32 0 8G 0 disk /var/lib/kubelet/pods/663ad9b9-dc6d-4d25-b134-f23850df4181/volumes/kubernetes.io~csi/dddd-2-26-0-rc3-5ae9f1f7-5cbe-4354-9a81-68a95c1cc44f/mount
/var/lib/kubelet/plugins/kubernetes.io/csi/cinder.csi.openstack.org/904f93f9fdf99626cf577800c1536e99643e567c7cda061bd7ffb9bf2cef6cf8/globalmount
journalctl logs from the node
=============================
master ssh 10.0.16.4 journalctl | grep NodeExpandVolume
Jun 24 03:11:23 worker kubelet[4781]: I0624 03:11:23.639270 4781 operation_generator.go:2227] "MountVolume.NodeExpandVolume succeeded for volume \"dddd-2-26-0-rc3-7190afcf-543c-4a4a-a508-4f19b0ca2006\" (UniqueName: \"kubernetes.io/csi/cinder.csi.openstack.org^37f48caa-f5ed-4b32-bd21-b24e1d227deb\") pod \"ee-container-registry-registry-7787b97786-7p5zd\" (UID: \"81c25228-6130-457d-b60c-2336cbca3449\") worker" pod="kube-system/ee-container-registry-registry-7787b97786-7p5zd"
Jun 24 03:12:37 worker kubelet[4781]: I0624 03:12:37.741528 4781 operation_generator.go:2227] "MountVolume.NodeExpandVolume succeeded for volume \"dddd-2-26-0-rc3-7190afcf-543c-4a4a-a508-4f19b0ca2006\" (UniqueName: \"kubernetes.io/csi/cinder.csi.openstack.org^37f48caa-f5ed-4b32-bd21-b24e1d227deb\") pod \"ee-container-registry-registry-7787b97786-7p5zd\" (UID: \"81c25228-6130-457d-b60c-2336cbca3449\") worker" pod="kube-system/ee-container-registry-registry-7787b97786-7p5zd"
Jun 24 03:14:03 worker kubelet[4781]: I0624 03:14:03.656787 4781 operation_generator.go:2227] "MountVolume.NodeExpandVolume succeeded for volume \"dddd-2-26-0-rc3-7190afcf-543c-4a4a-a508-4f19b0ca2006\" (UniqueName: \"kubernetes.io/csi/cinder.csi.openstack.org^37f48caa-f5ed-4b32-bd21-b24e1d227deb\") pod \"ee-container-registry-registry-7787b97786-7p5zd\" (UID: \"81c25228-6130-457d-b60c-2336cbca3449\") worker" pod="kube-system/ee-container-registry-registry-7787b97786-7p5zd"
The issue here is that the expansion code never verifies whether the disk was actually expanded. It should always validate that the disk geometry is the same as requested, or bigger, and only then return OK. Currently that check is only done if rescan-on-resize is enabled, and that only works for iSCSI devices.
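The missing check described above can be sketched in shell: compare the kernel-visible size against the requested capacity before declaring the expansion successful. This demo uses a scratch file as a stand-in; on a real node the size would come from `blockdev --getsize64 /dev/vdX` instead of `stat`, and the device name and sizes here are examples only:

```shell
# Sketch: only report success if the visible size meets the requested capacity.
REQUIRED=$((48 * 1024 * 1024))          # requested capacity (48M for the demo)
truncate -s 48M /tmp/fake-device        # stand-in for the expanded block device
ACTUAL=$(stat -c %s /tmp/fake-device)   # on a node: blockdev --getsize64 /dev/vdX
if [ "$ACTUAL" -ge "$REQUIRED" ]; then
  echo "expanded"                       # safe to resize the filesystem and return OK
else
  echo "not expanded yet: $ACTUAL < $REQUIRED"
fi
# prints "expanded"
```

With such a check in place, NodeExpandVolume would keep failing (and retrying) until the hypervisor actually exposes the new size, instead of reporting success against a stale device.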