blob-csi-driver

rm -rf failed with "Directory not empty"

Open tanvp112 opened this issue 4 months ago • 6 comments

What happened:

Create PVC and pod as usual:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: azureblob-fuse-test
  namespace: kube-public
spec:
  storageClassName: azureblob-fuse-premium
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: azureblob-fuse-test
  namespace: kube-public
spec:
  volumes:
  - name: azureblob-fuse-test
    persistentVolumeClaim:
      claimName: azureblob-fuse-test
  containers:
  - name: azureblob-fuse-test
    image: busybox
    command:
    - sh
    - -c
    - sleep 3600
    volumeMounts:
    - name: azureblob-fuse-test
      mountPath: /test

Exec into the pod and make changes:

cd /test \
  && mkdir -p one two three \
  && mkdir -p one/four/five \
  && mv one/four one/six \
  && echo 'hello, world!' > one/six/hello.txt \
  && cat one/six/hello.txt \
  && mv one/six one/seven \
  && cat one/seven/hello.txt \
  && mv one/seven/hello.txt  one/seven/world.txt \
  && cat one/seven/world.txt

The mv commands did not report any failure, but the log on the node shows:

Mar 11 11:49:37 aks-system-31080464-vmss000000 blobfuse2[371334]: LOG_ERR [cache_policy.go (127)]: lruPolicy::DeleteItem : Failed to delete local file /mnt/<redacted>#fuse26f630d18f7445c29d9#pvc-9790d1f1-5f02-4788-8d0a-509fbedb8e90##kube-public#/one/six
Mar 11 11:49:37 aks-system-31080464-vmss000000 blobfuse2[371334]: LOG_ERR [cache_policy.go (127)]: lruPolicy::DeleteItem : Failed to delete local file /mnt/<redacted>#pvc-9790d1f1-5f02-4788-8d0a-509fbedb8e90##kube-public#/one/six
Mar 11 11:49:37 aks-system-31080464-vmss000000 blobfuse2[371334]: LOG_ERR [block_blob.go (534)]: BlockBlob::getAttrUsingList : blob one/seven/world.txt does not exist

Running rm -rf /test/* now fails with rm: can't remove 'one': Directory not empty.

Both ls -a /test/one and the Portal show that the one directory has no objects in it.

Running mv /test/one /test/two followed by rm -rf /test/* then succeeds!

Repeating the test commands above reproduces the same result every time.
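For repeated runs, the steps above can be wrapped in a small helper. This is only a sketch: the function name repro_rename_cleanup is hypothetical, and it assumes a POSIX sh (such as the busybox one in the pod) with ls -A available. It replays the rename sequence under a given directory, attempts the cleanup, and reports whether anything is left behind.

```shell
#!/bin/sh
# Hypothetical helper (not part of blob-csi-driver or blobfuse2).
# Replays the rename sequence from this report under "$1", then
# tries to remove everything and reports what is left, if anything.
repro_rename_cleanup() {
  dir="$1"

  # Same directory and file renames as in the report, run in a
  # subshell so the caller's working directory is unchanged.
  (
    cd "$dir" || exit 2
    mkdir -p one two three &&
      mkdir -p one/four/five &&
      mv one/four one/six &&
      echo 'hello, world!' > one/six/hello.txt &&
      mv one/six one/seven &&
      mv one/seven/hello.txt one/seven/world.txt &&
      cat one/seven/world.txt > /dev/null
  ) || { echo "setup failed"; return 2; }

  # Cleanup step that fails in the report with "Directory not empty".
  if rm -rf "$dir"/* 2>/dev/null && [ -z "$(ls -A "$dir")" ]; then
    echo "clean"
  else
    echo "leftover: $(ls -A "$dir")"
    return 1
  fi
}
```

Inside the pod this would be called as repro_rename_cleanup /test. On an ordinary local filesystem it should print clean, whereas the behavior described in this report would leave one behind.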

What you expected to happen:

The mv and rm -rf commands should just work, and the fuse log should be free of errors.

How to reproduce it:

See above.

Anything else we need to know?:

IMPORTANT: Using a new StorageClass with --use-adls=true and isHnsEnabled: "true" yields the same outcome.
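The StorageClass used for that second test is not shown in this report. A minimal sketch of what such a class might look like follows, written out through a shell helper so it can be applied with kubectl. The class name azureblob-fuse-hns, the helper name write_hns_storageclass, the skuName, and the protocol value are all assumptions; only --use-adls=true and isHnsEnabled: "true" come from the report.

```shell
#!/bin/sh
# Hypothetical helper: writes an example StorageClass manifest to "$1".
# Only --use-adls=true and isHnsEnabled: "true" are taken from the
# report; every other value is an assumption for illustration.
write_hns_storageclass() {
  cat > "$1" <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azureblob-fuse-hns        # hypothetical name
provisioner: blob.csi.azure.com
parameters:
  skuName: Premium_LRS            # assumed
  protocol: fuse2                 # assumed; the node log shows blobfuse2
  isHnsEnabled: "true"
mountOptions:
  - -o allow_other
  - --use-adls=true
reclaimPolicy: Delete
volumeBindingMode: Immediate
EOF
}
```

After write_hns_storageclass sc-hns.yaml, the manifest could be applied with kubectl apply -f sc-hns.yaml and referenced from the PVC through storageClassName.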

Environment:

  • CSI Driver version: mcr.microsoft.com/oss/kubernetes-csi/blob-csi:v1.22.4
  • Kubernetes version (use kubectl version): 1.27.9
  • OS (e.g. from /etc/os-release): Ubuntu 22.04.4 LTS
  • Kernel (e.g. uname -a): Linux aks-system-31080464-vmss000000 5.15.0-1056-azure
  • Install tools: N/A
  • Others: N/A

tanvp112 avatar Mar 11 '24 12:03 tanvp112

@vibhansa-msft @souravgupta-msft could you take a look at this issue? thanks.

andyzhangx avatar Mar 12 '24 06:03 andyzhangx

@andyzhangx , @vibhansa-msft @souravgupta-msft, any update for this issue?

tanvp112 avatar Mar 27 '24 02:03 tanvp112

Hi @tanvp112, this is a bug in the file cache mode. We will fix this in our next release. For mitigation, can you please try block cache instead of file cache.

souravgupta-msft avatar Mar 27 '24 10:03 souravgupta-msft

> Hi @tanvp112, this is a bug in the file cache mode. We will fix this in our next release. For mitigation, can you please try block cache instead of file cache.

@souravgupta-msft so the --block-cache option would work? And the default is file cache?

andyzhangx avatar Mar 27 '24 10:03 andyzhangx

Yes, file cache is the default. This issue is in file cache, where the locally cached directory is not getting renamed. You can use block cache to mitigate this. Let us know if the issue persists with block cache.

souravgupta-msft avatar Mar 27 '24 10:03 souravgupta-msft
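
As a sketch of the mitigation discussed above, a StorageClass could pass the --block-cache option mentioned in this thread through mountOptions. This assumes the driver forwards mountOptions to blobfuse2 unchanged. The class name azureblob-fuse-blockcache, the helper name write_block_cache_storageclass, the skuName, and the protocol value are hypothetical or assumed, and nothing in the thread confirms that this exact manifest resolves the issue.

```shell
#!/bin/sh
# Hypothetical helper: writes an example StorageClass manifest to "$1"
# that requests block cache instead of the default file cache.
# The --block-cache option is the one named in this thread; all other
# values are assumptions for illustration.
write_block_cache_storageclass() {
  cat > "$1" <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azureblob-fuse-blockcache   # hypothetical name
provisioner: blob.csi.azure.com
parameters:
  skuName: Premium_LRS              # assumed
  protocol: fuse2                   # assumed; the node log shows blobfuse2
mountOptions:
  - -o allow_other
  - --block-cache
reclaimPolicy: Delete
volumeBindingMode: Immediate
EOF
}
```

After write_block_cache_storageclass sc-block.yaml and kubectl apply -f sc-block.yaml, the PVC from the report would point its storageClassName at the new class, and the same mv and rm -rf sequence can be rerun to check whether the error still appears.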