nfs-ganesha-server-and-external-provisioner

ESTALE 116 Error

Open infinitydon opened this issue 2 years ago • 3 comments

Hi,

We are currently observing issues in the following scenarios:

  • When the NFS server pod restarts due to disruptions in connectivity between the NFS server pod and the EBS backend volume.
  • When the worker node hosting the NFS server pod reboots.
  • When the worker node hosting the NFS server pod is re-created as a result of EC2 maintenance and the NFS server pod comes back up on a fresh EC2 worker node.

In all these cases, we notice that the application pods using the mounts go into a stale state, like the one below:

user:~$ cd /mounted
-bash: cd: /mounted: Stale file handle

Currently, manual intervention is always needed: the application pods have to be restarted for them to recover, roughly as sketched below.
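A minimal sketch of that manual recovery (the workload name and namespace are placeholders for whichever deployment mounts the NFS-backed volume):

# Restart the application pods so they remount the export and drop the stale handles.
# "my-app" and "apps" are placeholders for the affected workload and its namespace.
kubectl -n apps rollout restart deployment/my-app

# Watch the rollout until the replacement pods are Running again.
kubectl -n apps rollout status deployment/my-app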

Any idea how to resolve this?

infinitydon · Oct 09 '23

Hello @infinitydon,

Have you managed to resolve this issue? I'm experiencing the same problem.

renierwoo · Nov 27 '23

Hi again. Adding the parameter "-device-based-fsids=false" to the args section of the provisioner container in the deployment/statefulset solves my problem; a rough sketch of applying it is below.
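A minimal sketch, assuming the provisioner runs as a statefulset named "nfs-provisioner" in a namespace of the same name, with the provisioner as the first container; adjust these placeholders to your own manifest before patching:

# Append -device-based-fsids=false to the provisioner container's args.
kubectl -n nfs-provisioner patch statefulset nfs-provisioner --type=json \
  -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "-device-based-fsids=false"}]'

# Verify the flag is present once the pod has rolled.
kubectl -n nfs-provisioner get statefulset nfs-provisioner \
  -o jsonpath='{.spec.template.spec.containers[0].args}'

Equivalently, the flag can be added directly to the args list in the manifest and the manifest re-applied.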

renierwoo · Nov 30 '23

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot · Feb 29 '24

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot · Mar 30 '24