nfs-ganesha-server-and-external-provisioner

ESTALE 116 Error

Open infinitydon opened this issue 2 years ago • 3 comments

Hi,

We are currently observing issues in the following scenarios:

  • When the NFS server pod restarts due to disruptions in connectivity between the NFS server pod and the EBS backend volume.
  • When the worker node hosting the NFS server pod reboots.
  • When the worker node hosting the NFS server pod is re-created as a result of EC2 maintenance and the NFS server pod comes back up on a fresh EC2 worker node.

In all these cases, we notice that the application pods using the mounts go into a stale state, like the one below:

user:~$ cd /mounted
-bash: cd: /mounted: Stale file handle

Currently, manual intervention is always needed: the application pods have to be restarted for them to recover, roughly as sketched below.
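A minimal sketch of that manual recovery (the workload name and namespace are placeholders for whichever deployment mounts the NFS-backed volume):

# Restart the application pods so they remount the export and drop the stale handles.
# "my-app" and "apps" are placeholders for the affected workload and its namespace.
kubectl -n apps rollout restart deployment/my-app

# Watch the rollout until the replacement pods are Running again.
kubectl -n apps rollout status deployment/my-app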

Any idea how to resolve this?

infinitydon · Oct 09 '23

Hello @infinitydon,

Have you managed to resolve this issue? I'm experiencing the same problem.

renierwoo · Nov 27 '23

Hi again. Adding the parameter "-device-based-fsids=false" to the args section of the provisioner container in the deployment/statefulset solves my problem; a rough sketch of applying it is below.
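A minimal sketch, assuming the provisioner runs as a statefulset named "nfs-provisioner" in a namespace of the same name, with the provisioner as the first container; adjust these placeholders to your own manifest before patching:

# Append -device-based-fsids=false to the provisioner container's args.
kubectl -n nfs-provisioner patch statefulset nfs-provisioner --type=json \
  -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "-device-based-fsids=false"}]'

# Verify the flag is present once the pod has rolled.
kubectl -n nfs-provisioner get statefulset nfs-provisioner \
  -o jsonpath='{.spec.template.spec.containers[0].args}'

Equivalently, the flag can be added directly to the args list in the manifest and the manifest re-applied.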

renierwoo · Nov 30 '23

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot · Feb 29 '24

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot · Mar 30 '24