blob-csi-driver rm -rf failed with "Directory not empty"

What happened:

Create PVC and pod as usual:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: azureblob-fuse-test
  namespace: kube-public
spec:
  storageClassName: azureblob-fuse-premium
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: azureblob-fuse-test
  namespace: kube-public
spec:
  volumes:
  - name: azureblob-fuse-test
    persistentVolumeClaim:
      claimName: azureblob-fuse-test
  containers:
  - name: azureblob-fuse-test
    image: busybox
    command:
    - sh
    - -c
    - sleep 3600
    volumeMounts:
    - name: azureblob-fuse-test
      mountPath: /test

Exec into the pod and make changes:

cd /test \
  && mkdir -p one two three \
  && mkdir -p one/four/five \
  && mv one/four one/six \
  && echo 'hello, world!' > one/six/hello.txt \
  && cat one/six/hello.txt \
  && mv one/six one/seven \
  && cat one/seven/hello.txt \
  && mv one/seven/hello.txt  one/seven/world.txt \
  && cat one/seven/world.txt

The mv commands did not report any failure, but the log on the node showed:

Mar 11 11:49:37 aks-system-31080464-vmss000000 blobfuse2[371334]: LOG_ERR [cache_policy.go (127)]: lruPolicy::DeleteItem : Failed to delete local file /mnt/<redacted>#fuse26f630d18f7445c29d9#pvc-9790d1f1-5f02-4788-8d0a-509fbedb8e90##kube-public#/one/six
Mar 11 11:49:37 aks-system-31080464-vmss000000 blobfuse2[371334]: LOG_ERR [cache_policy.go (127)]: lruPolicy::DeleteItem : Failed to delete local file /mnt/<redacted>#pvc-9790d1f1-5f02-4788-8d0a-509fbedb8e90##kube-public#/one/six
Mar 11 11:49:37 aks-system-31080464-vmss000000 blobfuse2[371334]: LOG_ERR [block_blob.go (534)]: BlockBlob::getAttrUsingList : blob one/seven/world.txt does not exist

Running rm -rf /test/* now fails with rm: can't remove 'one': Directory not empty.

Checking with ls -a /test/one and in the Portal both show that the one directory has no objects in it.

Running mv /test/one /test/two followed by rm -rf /test/* succeeds!!

Repeating the test commands above gives the same result every time.
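
For anyone who wants to re-run this quickly, the steps above can be wrapped in a small script. This is only a sketch: repro_rename_then_rm is a made-up helper name, and the commands are the same ones from the report, run against whatever directory you pass in. It prints whether the final cleanup worked.

```shell
# Sketch of the reproduction steps as a reusable function.
# repro_rename_then_rm is a hypothetical helper name, not part of any tool.
repro_rename_then_rm() {
  target="$1"

  # Same rename sequence as in the report, run in a subshell so the
  # caller's working directory is left alone.
  (
    cd "$target" || exit 2
    mkdir -p one two three \
      && mkdir -p one/four/five \
      && mv one/four one/six \
      && echo 'hello, world!' > one/six/hello.txt \
      && mv one/six one/seven \
      && mv one/seven/hello.txt one/seven/world.txt \
      && cat one/seven/world.txt > /dev/null
  ) || { echo "setup failed"; return 2; }

  # On the affected mount this is the step reported to fail with
  # "Directory not empty". ${target:?} guards against an empty argument.
  if rm -rf "${target:?}"/* 2>/dev/null; then
    echo "cleanup ok"
    return 0
  fi
  echo "cleanup failed"
  return 1
}
```

Inside the pod this would be called as repro_rename_then_rm /test. On an ordinary local directory it should print "cleanup ok", which gives a baseline to compare against the blobfuse mount.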

What you expected to happen:

The mv and rm -rf commands should just work, and the fuse log should be free of errors.

How to reproduce it:

See above.

Anything else we need to know?:

IMPORTANT: Using a new StorageClass with --use-adls=true and isHnsEnabled: "true" yields the same outcome.

Environment:

CSI Driver version: mcr.microsoft.com/oss/kubernetes-csi/blob-csi:v1.22.4
Kubernetes version (use kubectl version): 1.27.9
OS (e.g. from /etc/os-release): Ubuntu 22.04.4 LTS
Kernel (e.g. uname -a): Linux aks-system-31080464-vmss000000 5.15.0-1056-azure
Install tools: N/A
Others: N/A

Mar 11 '24 12:03 tanvp112

@vibhansa-msft @souravgupta-msft could you take a look at this issue? thanks.

Mar 12 '24 06:03 andyzhangx

@andyzhangx, @vibhansa-msft, @souravgupta-msft, any update on this issue?

Mar 27 '24 02:03 tanvp112

Hi @tanvp112, this is a bug in the file cache mode. We will fix this in our next release. As a mitigation, can you please try block cache instead of file cache?

Mar 27 '24 10:03 souravgupta-msft

Hi @tanvp112, this is a bug in the file cache mode. We will fix this in our next release. For mitigation, can you please try block cache instead of file cache.

@souravgupta-msft so the --block-cache option would work? And file cache is the default?

Mar 27 '24 10:03 andyzhangx

Yes, file cache is the default. The issue is in file cache, where the local cached directory is not getting renamed. You can use block cache to mitigate this. Let us know if the issue persists with block cache.

Mar 27 '24 10:03 souravgupta-msft

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle stale
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

Jun 25 '24 11:06 k8s-triage-robot

/lifecycle rotten

Jul 25 '24 11:07 k8s-triage-robot

/close not-planned

Aug 24 '24 12:08 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

Aug 24 '24 12:08 k8s-ci-robot
