sshfs-win
sshfs stuck in FspFileSystemRemoveMountPoint
The sshfs-win drive works great for a while, but hangs after ~24 hours or so.
The network is reasonably stable: at least the PuTTY session to the same server is still active (for a few weeks already). But there's a chance that the underlying nfs->zfs connection experiences issues from time to time, and who knows what else.
The symptoms I see on the Windows side are:
- net use T: still recognizes the connection as active, but I cannot cd there:

    c:\>net use T:
    Local name        T:
    Remote name       \\sshfs.r\user@host
    Resource type     Disk
    The command completed successfully.

    c:\>T:
    Insufficient system resources exist to complete the requested service.

    c:\>

- If I try to open a new Windows Explorer window (e.g. Win+E), it just hangs.
- Clicking on the "T:" icon in a "Save As" dialog in the Windows "Snipping Tool" results in:

    [Window Title]
    Location is not available

    [Content]
    T:\ is not accessible.
    Insufficient system resources exist to complete the requested service.

    [OK]

- sshfs.exe is using 2 CPU cores at 100% with 2 active threads:
  - winfsp-x64.dll!FspFileSystemRemoveMountPoint+0xa0
  - ntdll.dll!RtlReleaseSRWLockExclusive+0x40

    ntoskrnl.exe!KeSynchronizeExecution+0x5c26
    ntoskrnl.exe!KeWaitForSingleObject+0x12e6
    ntoskrnl.exe!KeWaitForSingleObject+0xadb
    ntoskrnl.exe!KeWaitForSingleObject+0x1ff
    ntoskrnl.exe!ExWaitForRundownProtectionRelease+0x9fa
    ntoskrnl.exe!KeWaitForSingleObject+0x31cb
    ntoskrnl.exe!KeSynchronizeExecution+0x2e02
    ntdll.dll!NtDelayExecution+0x14
    KERNELBASE.dll!SleepEx+0x9a
    cygwin1.dll!strtosigno+0x305
    cygwin1.dll!sigfillset+0xa7f5
    cygwin1.dll!sigfillset+0xaa88
    cygwin1.dll!_main+0x4c5
    cygwin1.dll!_main+0x502
    cygwin1.dll!strtosigno+0x354
    cygwin1.dll!sigfillset+0xa7f5
    cygwin1.dll!sigfillset+0xaa88
    cygwin1.dll!_main+0x4c5
    cygwin1.dll!_main+0x502
    cygwin1.dll!setprogname+0x2c21
    cygwin1.dll!setprogname+0x411e
    cygwin1.dll!setprogname+0x41d4
    KERNEL32.DLL!BaseThreadInitThunk+0x14
    ntdll.dll!RtlUserThreadStart+0x21

  I might be completely wrong, but this looks a bit like WinFsp is trying to re-mount the disk, as discussed here, even though I've never tried to add the Recovery DWORD yet (not sure if it's already 1 by default in the version I use).
- Note that I use OSFMount v3.1 (1000) to mount the "*.img" that is stored on T:\. AFAIU, that's the only open file on T:\. If I click "dismount" there, it hangs indefinitely on the "Notifying applications that device is being removed..." message (not sure if that's the actual reason).
- Moreover, if I try to kill the "OSFMount.exe" process with Sysinternals Process Explorer, I get:

    ---------------------------
    Process Explorer
    ---------------------------
    Error terminating process:
    Access is denied.
    ---------------------------
    OK
    ---------------------------

  Thus, sadly, I was unable to test whether this is what blocks sshfs.exe.
- OSFMount does not behave like that when WinFsp is not involved. Although I cannot say I have a lot of experience with it, I tried to reproduce the "unable to read a file" condition for it. It does behave a bit weirdly and seems not to react explicitly to being unable to read the disk image file, but there's nothing like a hanging Windows Explorer.
What actually helps is restarting the "WinFsp.Launcher" service. Everything immediately goes back to normal (e.g. I'm able to start Windows Explorer again). Re-mounting with net use T: \\sshfs.r\user@host and then OSFMount works OK.
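The workaround above could be scripted roughly as follows. This is only a sketch: the service name, drive letter, and UNC path are the ones quoted in this report, `recovery_commands` and `recover` are hypothetical helper names, and it is an assumption that a plain `net stop` / `net start` followed by `net use` is enough (the report does not say whether the stale mapping has to be deleted first).

```python
import subprocess

SERVICE = "WinFsp.Launcher"        # service name as quoted in the report
DRIVE = "T:"                       # drive letter used in the report
REMOTE = r"\\sshfs.r\user@host"    # placeholder UNC path from the report


def recovery_commands(service=SERVICE, drive=DRIVE, remote=REMOTE):
    """Return the command lines for the restart-and-remount workaround."""
    return [
        ["net", "stop", service],       # stop the launcher service
        ["net", "start", service],      # start it again
        ["net", "use", drive, remote],  # re-mount the sshfs-win drive
    ]


def recover(dry_run=True):
    """Print the commands, or run them when dry_run is False (Windows only)."""
    for cmd in recovery_commands():
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=False: keep going even if one step reports an error
            subprocess.run(cmd, check=False)
```

Calling `recover()` only prints the three commands; `recover(dry_run=False)` would execute them and most likely needs an elevated prompt, since it restarts a service.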
OS version and build: Windows 10 version 1803, 17134.1304
WinFsp version and build: 1.8.20304, sshfs-win-3.5.20357-x64
I should probably update all these versions and try to reproduce this behavior. I hope to get a chance to do this at some point.
Apologies for the late response.
I looked into this issue, but unfortunately I do not have a good answer for you. However, I doubt that the problem is really in FspFileSystemRemoveMountPoint: the relevant stack trace looks completely wrong to me.
I'm very sorry for such a late response as well. Thanks for looking into the issue!
- I have now been on winfsp-1.9.21096.msi, sshfs-win 3.5.20357, Windows 10 version 20H2 for a few months. Effectively the same issue still happens from time to time.
- I now guess that OSFMount is only partially related to the issue. These days, when sshfs-win "gets stuck", Windows Explorer does not hang on startup (maybe OSFMount was blocking it somehow, because sshfs-win was in turn blocking OSFMount). And the stack is somewhat different. But the overall "sshfs-win is stuck" effect is the same.
- I'm still not sure whether AV is somehow involved here.
I found a way to reproduce this (or at least a somewhat similar) issue within a few minutes with 100% repeatability. I start 100 processes (a Matlab parallel pool, in this case) that read images (~10MB files) in parallel from the sshfs-win drive (just ~200Mbit/s total throughput). Initially, a freshly started sshfs.exe process consumes only 7.6MB of RAM. Once these processes start working, RAM usage goes up linearly together with the total "Handles" counter. Once RAM usage reaches 1060MB (the "Handles" counter reaches 16711680 at this point), sshfs-win hangs and its network activity goes to zero.
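For anyone without Matlab, a similar parallel-read load can be approximated with a short script. This is a sketch under assumptions: it uses a pool of 100 threads instead of 100 separate processes, `read_file` and `read_tree_in_parallel` are hypothetical names, and whether this access pattern triggers the same growth in sshfs.exe has not been verified.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def read_file(path, chunk_size=1 << 20):
    """Read one file to the end in 1 MiB chunks and return its size in bytes."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total


def read_tree_in_parallel(root, workers=100):
    """Read every regular file under root using `workers` concurrent readers.

    Returns (number of files read, total bytes read).
    """
    files = [p for p in Path(root).rglob("*") if p.is_file()]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sizes = list(pool.map(read_file, files))
    return len(files), sum(sizes)
```

Pointing it at a folder of images on the mounted drive, e.g. `read_tree_in_parallel("T:\\")`, while watching the sshfs.exe "Handles" counter and RAM usage in Process Hacker would show whether the same linear growth appears.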
The "RAM use" tab in Process Hacker shows a large number of 1MB blocks allocated.
If I pause or stop the reader processes (just a normal stop, not killing them abruptly) before sshfs-win hangs (e.g. at ~500MB RAM usage), the RAM usage does not go down. Thus, I guess this is just a leak. Not sure what affects its severity. I'm not reading too many files here, way below 100k files before it hangs.
When I copy the same files to my local HDD, nothing wrong happens: the sshfs.exe RAM usage is stable at ~43MB, and the "Handles" counter is not growing that fast (but it still seems to grow constantly). Thus, the leak severity probably depends on how the application reads the file.
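A quick back-of-the-envelope check on the numbers above, as a sketch only. It assumes "MB" here means MiB and that all of the RAM growth is attributable to leaked handles; neither is confirmed by the report. The handle count at the hang turns out to be a round hexadecimal value just below 2^24 (16777216), which may mean the hang coincides with a per-process handle limit rather than with memory running out. That is an inference, not something measured here.

```python
MIB = 1024 * 1024

# Figures quoted in the report above.
handles_at_hang = 16_711_680
ram_start_mb = 7.6
ram_at_hang_mb = 1060

# The handle count is a round number in hexadecimal.
print(hex(handles_at_hang))  # 0xff0000

# Average memory growth per handle, assuming "MB" means MiB and that
# every byte of growth belongs to a leaked handle.
bytes_per_handle = (ram_at_hang_mb - ram_start_mb) * MIB / handles_at_hang
print(f"{bytes_per_handle:.1f} bytes per handle")
```

A few tens of bytes per handle would be in line with many small objects never being released, as opposed to whole file buffers being retained.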