notify
fsevent hangs on Mac during shutdown
System details
- OS/Platform name and version: macOS 10.13.6 on a MacBook Pro (15-inch, 2016)
- Rust version (if building from source): `rustc --version`: rustc 1.38.0-nightly (07e0c3651 2019-07-16)
- Notify version (or commit hash if building from git): 4.0.12
Hi! Users report that rust-analyzer sometimes hangs during shutdown. The stack trace points to this code:
https://github.com/passcod/notify/blob/2b1f1d4d1acc8b9738ffbe41bfe6043ba37f9431/src/fsevent.rs#L109-L111
Downstream issue (with captured stack trace): https://github.com/rust-analyzer/rust-analyzer/issues/1541
cc @killercup
- OS/Platform name and version: macOS 10.13.6 on a MacBook Pro (15-inch, 2016)
- Rust version: rustc 1.38.0-nightly (07e0c3651 2019-07-16)
I'm thinking this is likely because of #118. The original race was that the fsevent loop wasn't yet running, so ending it wasn't doing anything. The workaround was therefore to wait until it tells us it's waiting. Here, the loop is running (probably, given how this happens), but at the point of dropping, the fsevent loop isn't waiting for an event, so we yield until it does, but it's shutting down, so we're never going to get there.
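To make the failure mode concrete, here is a minimal std-only sketch of that kind of wait. It is not notify's code: `wait_for_waiting` and the `is_waiting` flag are hypothetical stand-ins for "yield until the loop says it's waiting", and the time limit is added only so the example terminates.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for "yield until the loop reports it is waiting".
/// Returns true if the flag was observed, false if `limit` elapsed first.
fn wait_for_waiting(is_waiting: &AtomicBool, limit: Duration) -> bool {
    let start = Instant::now();
    while !is_waiting.load(Ordering::SeqCst) {
        if start.elapsed() >= limit {
            // Without this bound, a loop that is already shutting down
            // (and so never sets the flag) would keep us here forever.
            return false;
        }
        thread::yield_now();
    }
    true
}
```

If the flag is never set, the call returns `false` once the limit elapses; drop the limit and the same call never returns, which would match the reported hang.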
#118 had a better solution: to use loop observers. Unfortunately I'm not a mac developer and don't really know how best to use those for this purpose. My guess is:
At startup:
- Create a loop observer firing on runloop `exit` that, idk, sets an "I'm dead" flag or sends to a channel or something.
- Save that in our struct.
- Start the thread:
  - Check that the observer isn't invalidated. If it is, return. (If the observer has already been invalidated, we're stopping before we're starting, which was the cause of the initial deadlock.)
  - Add the observer to the loop.
  - Check again.
  - Run the loop.
To stop the loop:
1. Check if the observer is valid. If it's not, return. (An invalid observer would mean it's fired already, that is, that the loop has already exited.)
2. Check if the runloop contains the observer. If it doesn't, return. (If the observer isn't present, we're exiting either before the loop has been initialised, before the thread has run, or while it's exiting but before the observer has been invalidated. Or, because this is multi-threaded and anything is possible, the observer was invalidated in between steps 1 and 2.)
3. Invalidate the observer. This is to prevent the loop from starting if we're stopping before starting.
4. Check the flag/channel that the observer sets/sends. If the observer has run, return.
5. Tell the runloop to stop.
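The intent of the start and stop steps above can be modelled without any macOS APIs. The sketch below is an assumption-heavy stand-in, not the CFRunLoop implementation: `LoopHandle`, `cancelled`, `exited`, and `stop_tx` are all hypothetical names, a `cancelled` flag plays the role of the invalidated observer, an `exited` flag plays the role of the exit observer, and blocking on a channel plays the role of running the loop. It also simplifies the ordering by cancelling first.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{channel, Sender};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Hypothetical model of the loop thread; not notify or Core Foundation code.
struct LoopHandle {
    cancelled: Arc<AtomicBool>, // stands in for "observer invalidated"
    exited: Arc<AtomicBool>,    // set on loop exit, like the exit observer
    stop_tx: Sender<()>,        // stands in for "tell the runloop to stop"
    thread: Option<JoinHandle<()>>,
}

impl LoopHandle {
    fn start() -> LoopHandle {
        let cancelled = Arc::new(AtomicBool::new(false));
        let exited = Arc::new(AtomicBool::new(false));
        let (stop_tx, stop_rx) = channel::<()>();
        let (c, e) = (cancelled.clone(), exited.clone());
        let thread = thread::spawn(move || {
            // If we were stopped before starting, never enter the loop.
            if !c.load(Ordering::SeqCst) {
                // "Run the loop": block until a stop message arrives
                // or the sender is dropped.
                let _ = stop_rx.recv();
            }
            e.store(true, Ordering::SeqCst); // the "exit observer" fires
        });
        LoopHandle { cancelled, exited, stop_tx, thread: Some(thread) }
    }

    /// Stops the loop without spin-waiting. Safe to call more than once.
    fn stop(&mut self) {
        self.cancelled.store(true, Ordering::SeqCst);
        if !self.exited.load(Ordering::SeqCst) {
            // A send error only means the loop thread is already gone.
            let _ = self.stop_tx.send(());
        }
        if let Some(t) = self.thread.take() {
            // Terminates: the thread either saw `cancelled` or receives the stop.
            let _ = t.join();
        }
    }
}
```

One difference worth noting: the channel buffers the stop message, so a stop sent before the thread reaches `recv` is not lost. The original race described in this thread was that ending a loop that wasn't yet running did nothing, so a real implementation could not rely on that property and would still need the observer checks.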
This has no infinite wait loops, so no hangs and no 100% CPU usage. I think it covers ~~all~~ most cases. To be extra careful, there may be cases for:
- starting the runloop for a few milliseconds only, rechecking the observers and flags, then running indefinitely.
- adding another observer on `enter` that seeds the `is_running` flag so we have a better indication of whether the loop is running.
- storing the thread handle and killing it after a timeout after we tell the runloop to stop.
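For the timeout idea, note that Rust's standard library offers neither a join-with-timeout nor a way to kill a thread, so a std-only version can only detect that the thread did not finish in time. A rough sketch, where `finished_within` is a hypothetical helper and the closure stands in for the loop thread's body:

```rust
use std::sync::mpsc::{channel, RecvTimeoutError};
use std::thread;
use std::time::Duration;

/// Runs `work` on a new thread and waits at most `limit` for it to end.
/// Returns true if the thread ended in time, false on timeout.
fn finished_within<F>(work: F, limit: Duration) -> bool
where
    F: FnOnce() + Send + 'static,
{
    let (done_tx, done_rx) = channel::<()>();
    thread::spawn(move || {
        work();
        // The receiver may already have given up; ignore the error.
        let _ = done_tx.send(());
    });
    match done_rx.recv_timeout(limit) {
        Ok(()) => true,
        // The sender was dropped without reporting (e.g. `work` panicked),
        // so the thread is gone either way.
        Err(RecvTimeoutError::Disconnected) => true,
        Err(RecvTimeoutError::Timeout) => false,
    }
}
```

On a timeout the thread is simply left detached here; actually killing it would need platform-specific calls outside the standard library.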
This isn't especially hard to implement, so I can do that fairly soon. However, beforehand I'd want some kind of review of the above (or someone to say this is ridiculous and "[this] is how to do that") by a Rust developer familiar with CFRunLoop and mac programming (or a mac developer familiar with Rust, whichever). If you know someone... ;)
Pinging @cmyr in case they can take a look
#210 is merged and fixes at least several deadlocks that also happen on Linux, so my take would be to release a new version and see whether this resolves it.
4.0.14 is released. I'd appreciate feedback on whether this fixes the problem (or others), as I can't test it on Mac.