"File name too long" errors when deleting Minio volumes with huge paths

Open nickbp opened this issue 3 years ago • 2 comments

I had created a Minio instance in my k3s cluster and noticed that the helper-pod-delete-pvc pods were failing with errors like the following (output is shortened here and below to avoid flooding the ticket, but you get the idea):

$ kubectl logs -n kube-system helper-pod-delete-pvc-af80f49d-3070-410d-a005-db1f7b60face
rm: can't stat '/mnt/data/pvc-af80f49d-3070-410d-a005-db1f7b60face_walden_storage-minio-4/.minio.sys/tmp/.trash/dcf502d7-e792-479c-bee8-c6eefcf727e4/.trash/9ab0a5f2-ee1f-47cc-aba1-77560476a473/.trash/606d5f79-2fee-43e7-8bc7-8732f25674a4/.trash/[...]/f382c611-3ef5-4111-baa3-e82a190f6325/.trash/fc2f0259-32d0-47d8-829b-bc221de52b28/.trash/84aacd24-c87a-43e9-8c92-19bc0349e4a5/.trash/8f2c7ff6-131e-42b1-94e8-bcd7f971a55a/.trash/91c134f6-870f-4ca9-a8fd-fb72f7148ab5/.trash': File name too long

It looks like the root cause is a lower maximum filename length in busybox.

I found that I was able to reproduce the error on the host machine via busybox rm -rf against the PVC directory:

root@pi-04:/mnt/data# busybox rm -rf pvc-af80f49d-3070-410d-a005-db1f7b60face_walden_storage-minio-4/
rm: can't stat 'pvc-af80f49d-3070-410d-a005-db1f7b60face_walden_storage-minio-4/.minio.sys/tmp/.trash/dcf502d7-e792-479c-bee8-c6eefcf727e4/.trash/[...]/f382c611-3ef5-4111-baa3-e82a190f6325/.trash/fc2f0259-32d0-47d8-829b-bc221de52b28/.trash/84aacd24-c87a-43e9-8c92-19bc0349e4a5/.trash/8f2c7ff6-131e-42b1-94e8-bcd7f971a55a/.trash/91c134f6-870f-4ca9-a8fd-fb72f7148ab5/.trash/dd2d04ea-7619-47ff-8566-50811645b0b7': File name too long

However, if I use GNU rm -rf from the same shell, the same delete works fine:

root@pi-04:/mnt/data# rm -rf pvc-af80f49d-3070-410d-a005-db1f7b60face_walden_storage-minio-4/
root@pi-04:/mnt/data# echo $?
0

From this, the solution might be to replace the busybox image with something else. I haven't yet dug into why the path fails on busybox specifically, but swapping the image feels like the easiest solution.

Also, to be clear, I don't know why Minio is creating huge paths like this, but I'm consistently seeing it across deployments, so it seems to be "standard". In any case, it'd be better if local-path-provisioner could clean up the volumes successfully in this scenario.
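
As a rough check on the length theory above: on Linux, "File name too long" (ENAMETOOLONG) is returned either when a single path component exceeds NAME_MAX or when the whole path passed to a system call exceeds PATH_MAX. The components here are short (UUIDs and .trash), which suggests, without proving it, that the total path length is what hits the limit. The sketch below is one way to measure the longest path under a directory. The function name longest_path_len is made up for illustration, and it assumes no path contains a newline.

```shell
#!/bin/sh
# Print the length in bytes of the longest path found under directory $1.
# Hypothetical helper, not part of local-path-provisioner.
# Assumes no path contains a newline. Errors from find are silenced, so if
# find cannot descend all the way the result is only a lower bound.
longest_path_len() {
    find "$1" 2>/dev/null |
        LC_ALL=C awk '{ n = length($0); if (n > max) max = n } END { print max + 0 }'
}
```

The result can be compared with the limits reported for the same filesystem by `getconf PATH_MAX /mnt/data` and `getconf NAME_MAX /mnt/data` (commonly 4096 and 255 on Linux), for example after running `longest_path_len /mnt/data` on the affected node.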


For reference the helper-pod-delete-pvc definition is as follows:

$ kubectl get pod -o yaml -n kube-system helper-pod-delete-pvc-af80f49d-3070-410d-a005-db1f7b60face
apiVersion: v1
kind: Pod
metadata:
  annotations:
    cni.projectcalico.org/containerID: 9e566deeb6d1596ba3a1ba49f2989919255fe896eea56469363ccbe07c97ad17
    cni.projectcalico.org/podIP: ""
    cni.projectcalico.org/podIPs: ""
  creationTimestamp: "2022-02-09T21:13:13Z"
  name: helper-pod-delete-pvc-af80f49d-3070-410d-a005-db1f7b60face
  namespace: kube-system
  resourceVersion: "184079915"
  uid: cd06247d-88b4-498b-b918-7f03b23b1f5c
spec:
  containers:
  - args:
    - -p
    - /mnt/data/pvc-af80f49d-3070-410d-a005-db1f7b60face_walden_storage-minio-4
    - -s
    - "1073741824"
    - -m
    - Filesystem
    command:
    - /bin/sh
    - /script/teardown
    image: rancher/mirrored-library-busybox:1.32.1
    imagePullPolicy: IfNotPresent
    name: helper-pod
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /mnt/data
      name: data
    - mountPath: /script
      name: script
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-xwbrg
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: pi-04
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Never
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: local-path-provisioner-service-account
  serviceAccountName: local-path-provisioner-service-account
  terminationGracePeriodSeconds: 30
  tolerations:
  - operator: Exists
  volumes:
  - hostPath:
      path: /mnt/data
      type: DirectoryOrCreate
    name: data
  - configMap:
      defaultMode: 420
      items:
      - key: setup
        path: setup
      - key: teardown
        path: teardown
      name: local-path-config
    name: script
  - name: kube-api-access-xwbrg
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2022-02-09T21:13:13Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2022-02-09T21:13:13Z"
    message: 'containers with unready status: [helper-pod]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2022-02-09T21:13:13Z"
    message: 'containers with unready status: [helper-pod]'
    reason: ContainersNotReady
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2022-02-09T21:13:13Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: containerd://539fc969eb2707030a2821b84c0911099f6f414013195028ce835c1a248dae03
    image: docker.io/rancher/library-busybox:1.32.1
    imageID: docker.io/rancher/library-busybox@sha256:ec14ead228e6c28f2523ca6d866dd442b90ba64d447bf7b194a6fb34cc6174c8
    lastState: {}
    name: helper-pod
    ready: false
    restartCount: 0
    started: false
    state:
      terminated:
        containerID: containerd://539fc969eb2707030a2821b84c0911099f6f414013195028ce835c1a248dae03
        exitCode: 1
        finishedAt: "2022-02-09T21:13:15Z"
        reason: Error
        startedAt: "2022-02-09T21:13:15Z"
  hostIP: 172.30.1.4
  phase: Failed
  podIP: 172.31.141.145
  podIPs:
  - ip: 172.31.141.145
  qosClass: BestEffort
  startTime: "2022-02-09T21:13:13Z"

And the teardown script is defined as follows:

$ kubectl get configmap -n kube-system local-path-config -o yaml
[...]
  teardown: |-
    #!/bin/sh
    while getopts "m:s:p:" opt
    do
        case $opt in
            p)
            absolutePath=$OPTARG
            ;;
            s)
            sizeInBytes=$OPTARG
            ;;
            m)
            volMode=$OPTARG
            ;;
        esac
    done
    rm -rf ${absolutePath}
[...]

nickbp, Feb 09 '22 21:02

Seems like it's not necessarily the fault of the busybox image itself, despite the busybox-specific repro in the description above. I manually configured HELPER_IMAGE=library/debian:11.2-slim and am still seeing errors. It now says cannot remove rather than can't stat, implying that it's indeed using GNU tools:

$ kubectl logs -n kube-system helper-pod-delete-pvc-b3bc5419-b1aa-4fc5-8396-232260d03952
rm: cannot remove '/mnt/data/pvc-b3bc5419-b1aa-4fc5-8396-232260d03952_walden_storage-minio-5/.minio.sys/tmp/.trash/62170d11-55fe-4594-a168-8fa886362e9c/.trash/1c074d65-cd2d-4260-817e-4fef2ff625a0/.trash/a766a068-4c49-4c76-8366-7fa08122c406/.trash/cda64dd2-87f6-448a-91a6-7df043eb3888/.trash/[...]/58239ddf-d9c9-463b-9205-2a8d16d9132d/.trash/32306a31-647c-4701-8352-82492bbf47ec/.trash/dcbc3939-8085-4663-8e31-b8f0df527858/.trash/8e24b082-c97e-41ec-b06d-742a03624a19/.trash/28af5f38-01bb-4fc4-83a4-8af6727353e7/.trash': File name too long
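
The image the helper pod actually ran can also be read directly from the pod spec, rather than inferred from the wording of the error. This is a minimal sketch: the wrapper name helper_pod_image is hypothetical, and index 0 assumes the pod has a single container, as in the helper-pod-delete-pvc definition shown earlier.

```shell
#!/bin/sh
# Print the image of the first container in a pod.
# Hypothetical wrapper around kubectl's jsonpath output.
# $1: namespace, $2: pod name
helper_pod_image() {
    kubectl get pod -n "$1" "$2" -o jsonpath='{.spec.containers[0].image}'
}
```

For the pod above this would be called as `helper_pod_image kube-system helper-pod-delete-pvc-b3bc5419-b1aa-4fc5-8396-232260d03952`, provided the failed pod still exists.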

But again, I was able to delete the directory just fine on the host, even when switching to the absolute path that the container uses. It's a bit of a mystery to me what the difference is here:

root@pi-02:/home/nick# rm -rf /mnt/data/pvc-b3bc5419-b1aa-4fc5-8396-232260d03952_walden_storage-minio-5
root@pi-02:/home/nick# echo $?
0

nickbp, Feb 09 '22 22:02