csi-driver-iscsi
iSCSI CSI driver fails to mount LUN in the right location for a replaced pod
What happened:
We're using the Bitnami PostgreSQL Helm chart (15.1.4) to run PostgreSQL on a MicroK8s v1.29 cluster. I wanted to leverage this CSI driver for the database storage, backed by an iSCSI LUN and target that I created on a QNAP NAS connected over a 10GbE network.
To connect to the LUN, I created a PV + PVC as in the examples and passed the PVC as the `primary.persistence.existingClaim` value when deploying the Helm chart.
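For reference, the relevant values override looked something like this (a sketch; the claim name `postgresql-pvc` is an assumption, substitute the name of the PVC you created):

```yaml
# values.yaml fragment for the Bitnami PostgreSQL chart
# (claim name is hypothetical)
primary:
  persistence:
    existingClaim: postgresql-pvc
```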
This was working like a charm; at last we could move away from risky node-local storage or slower NFS. However, when I replaced the pods of the chart's StatefulSet to increase its resources, the csi-iscsi-node somehow didn't mount the target at the right location of the new pod's volume.
The outcome (and how we realised): the new location of the volume, /var/snap/microk8s/common/var/lib/kubelet/pods/b88fdaea-a22e-42ac-90ae-d71f927dc300/volumes/kubernetes.io~csi/postgresql/mount, wasn't actually a mount of the storage on the NAS, but the node's root filesystem itself! A parallel data-ingestion operation then consumed the node's storage, degrading the node and, to some extent, the whole cluster, as many key workloads were evicted with DiskPressure and a taint was added to the node.
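One quick way to confirm this state on the node (a sketch using `findmnt` from util-linux; the pod UID in the path is the one from this report, substitute the current pod's):

```shell
# Pod volume path from the report; substitute the current pod's UID.
VOL=/var/snap/microk8s/common/var/lib/kubelet/pods/b88fdaea-a22e-42ac-90ae-d71f927dc300/volumes/kubernetes.io~csi/postgresql/mount

# --mountpoint (-M) prints the mount entry only if VOL is itself a mount
# point; if it prints nothing and exits non-zero, writes to VOL are landing
# on the node's root filesystem instead of the iSCSI LUN.
findmnt -M "$VOL" || echo "NOT a mount point: data is going to the node's root fs"
```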
Logs that we encountered:
```
I0530 23:43:45.773799 1 utils.go:64] GRPC request: {"target_path":"/var/snap/microk8s/common/var/lib/kubelet/pods/b88fdaea-a22e-42ac-90ae-d71f927dc300/volumes/kubernetes.io~csi/postgresql/mount","volume_id":"iscsi-postgresql-id"}
I0530 23:43:45.773861 1 mount_linux.go:164] Detected OS without systemd
W0530 23:43:45.777225 1 iscsi_util.go:95] warning: Unmount skipped because path does not exist: /var/snap/microk8s/common/var/lib/kubelet/pods/b88fdaea-a22e-42ac-90ae-d71f927dc300/volumes/kubernetes.io~csi/postgresql/mount
```
The `Detected OS without systemd` message is equally puzzling, as the node runs Ubuntu 22.04 :thinking: ... (my guess is that mount-utils can't find `systemd-run` inside the driver container, but I haven't verified that).
What you expected to happen:
Say the original pod volume location was:
/var/snap/microk8s/common/var/lib/kubelet/pods/9cd76fee-cd41-4869-90d2-d46ffedddf68/volumes/kubernetes.io~csi/postgresql/mount
-> This was actually the mount point of the filesystem backed by the iSCSI target.
And the new pod volume location was
/var/snap/microk8s/common/var/lib/kubelet/pods/b88fdaea-a22e-42ac-90ae-d71f927dc300/volumes/kubernetes.io~csi/postgresql/mount
I would expect the iSCSI CSI driver node plugin to unmount the target from the first location and re-mount it at the second location, corresponding to the replacement pod, with no data loss.
How to reproduce it:
- Create an iSCSI target + LUN.
- Create a PV + PVC as in the driver's examples, e.g. the PersistentVolume manifest:
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgresql-pv
  labels:
    name: postgresql
spec:
  storageClassName: postgresql-sc
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 1Gi
  csi:
    driver: iscsi.csi.k8s.io
    volumeHandle: iscsi-postgresql-id
    volumeAttributes:
      targetPortal: "X.X.X.X"
      portals: "[]"
      iqn: "iqn.<redacted>:iscsi.csi.8136ad"
      lun: "1"
      iscsiInterface: "default"
      discoveryCHAPAuth: "true"
      sessionCHAPAuth: "false"
```
- Customise and deploy the Bitnami PostgreSQL Helm chart, selecting the existing claim created in step 2 in values.yaml.
- Scale the StatefulSet to 0 and then back to 1, or kill the pod, which will instruct the StatefulSet controller to request a new pod from the kube API.
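For completeness, a PVC bound to the PV above might look like this (a sketch: the storageClassName, capacity, and label selector follow the PV manifest; the claim name is an assumption):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgresql-pvc  # hypothetical name; this is the value passed to existingClaim
spec:
  storageClassName: postgresql-sc
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  selector:
    matchLabels:
      name: postgresql
```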
Anything else we need to know?:
- I've since removed the PostgreSQL chart, but I can still see `warning: Unmount skipped because path does not exist` messages in the node logs.

Environment:
- CSI Driver version: commit hash 554efb1
- Kubernetes version (use `kubectl version`): v1.29.4
- OS (e.g. from /etc/os-release): Ubuntu 22.04.3 LTS
- Kernel (e.g. `uname -a`): 5.15.0-105-generic
- Install tools: open-iscsi
- Others: microk8s v1.29