Hanging on stale mounts

Open dgrant opened this issue 9 years ago • 2 comments

At home I have an NFS share on 192.168.1.2. At work that IP is reachable, but there is no share on it, and the script fails:

nfs_automount [2015-04-17 10:26:37-07:00]: [INFO] (dataset 5) Remote server/NFS service at '192.168.1.2' available.
nfs_automount [2015-04-17 10:26:37-07:00]: [INFO] (dataset 5) Local mount point '/mnt/david' active.

mount point:

192.168.1.2:/home/david on /mnt/david type nfs4 (rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,acregmax=3,acdirmin=3,acdirmax=3,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.1.10,local_lock=none,addr=192.168.1.2)

So that was mounted at home, and now I'm at work, where that share no longer exists but 192.168.1.2 is still reachable.

dgrant • Apr 17 '15 17:04

Most likely it is hanging in check_stale.

dgrant • Apr 17 '15 17:04

Yeah, it's the classic NFS hang: running ls or cd on the local mount point hangs too. I think the script needs a way to detect hanging shares without hanging itself, and then clean them up (unmount them). It could play it safe and only forcefully unmount a share when no processes are accessing that directory. We don't want programs to be killed if they are using the share and it is only down temporarily, or if there is a temporary network issue, but we do want to unmount the share when nothing is accessing it. At the very least, the script needs to know which shares will hang if touched, so that it doesn't hang itself.

dgrant • Apr 17 '15 17:04
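
The two checks described in the last comment could be sketched roughly as follows. This is not code from nfs_automount: `mount_responds` and `path_in_use` are hypothetical helper names, the 5-second default is arbitrary, and the sketch assumes GNU coreutils (`timeout`, `stat`) and a Linux `/proc`. The first helper probes a mount point in a child process that is killed if it blocks. The second looks for processes whose working directory or open files are under the path, reading only symlinks in `/proc` so that the check itself should not need to open anything on the share.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; a sketch under the assumptions stated above.

# Return 0 if the path answers a stat call within the time limit,
# non-zero if the probe fails or has to be killed.
mount_responds() {
    local mount_point="$1"
    local limit="${2:-5}"   # seconds; arbitrary default
    # SIGKILL is used because a process blocked on a hard NFS mount
    # typically does not react to SIGTERM.
    timeout -s KILL "$limit" stat -t -- "$mount_point" >/dev/null 2>&1
}

# Return 0 if some process visible in /proc has its working directory or an
# open file descriptor at or below the given path, 1 otherwise.
# Memory-mapped files and processes owned by other users (when not running
# as root) are not covered.
path_in_use() {
    local target="${1%/}"   # drop one trailing slash
    local link dest
    for link in /proc/[0-9]*/cwd /proc/[0-9]*/fd/*; do
        # readlink only reads the symlink text; unreadable entries are skipped.
        dest=$(readlink "$link" 2>/dev/null) || continue
        case "$dest" in
            "$target"|"$target"/*) return 0 ;;
        esac
    done
    return 1
}
```

With these, the script could skip any share for which `mount_responds /mnt/david 5` fails, and consider unmounting it only when `path_in_use /mnt/david` also reports nothing. Whether a forced (`umount -f`) or lazy (`umount -l`) unmount is the right cleanup is left open here.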