watcher icon indicating copy to clipboard operation
watcher copied to clipboard

Deadlock between `Watcher` and `Debounce` threads

Open bpasero opened this issue 1 year ago • 2 comments

Ever since https://github.com/parcel-bundler/watcher/commit/9b7c657df8ad1c0d020541eae0dd38b921c94fb8 landed, we have tried to update to newer versions but in VS Code events would stop at random and not recover. Now I had some time to diagnose this further and add logging into the various places where std::unique_lock<std::mutex> are used because I was suspecting a deadlock between threads.

VS Code is probably making it easier to uncover this issue though I would think it could happen otherwise too: in VS Code it is possible for the same folder being watched with different set of ignore rules, so you can end up with N events firing from fsevents for the same path.

In https://github.com/parcel-bundler/watcher/compare/master...bpasero:watcher:ben/fork-2, logging is added through out Watcher.cc and Debounce.cc, in this pattern:

  • ...LOCK 1: try to acquire a lock
  • ...LOCK 2: lock acquired
  • ...LOCK 3: lock released

Steps to Reproduce:

watcher.subscribe('/Users/bpasero/Development/Microsoft/vscode', (err, events) => {
    console.log(1, events, err);
});

watcher.subscribe('/Users/bpasero/Development/Microsoft/vscode', (err, events) => {
    console.log(2, events, err);
}, { ignore: ['aaa']});

Run above with correct paths to any folder (does not have to be vscode). Open a file from that folder and simply keep Cmd+S pressed in the editor to trigger a ton of saves (e.g. using VS Code).

=> 🐛 at one point, no more events are printed

Deadlock

With the logging in my fork, I get the following output:

Log Output
[2024-09-05 21:19:52, thread: 0x16f8f3000] Watcher::notify(): LOCK 1
[2024-09-05 21:19:52, thread: 0x16f8f3000] Watcher::notify(): LOCK 2
[2024-09-05 21:19:52, thread: 0x16f8f3000] Debounce::trigger(): LOCK 1
[2024-09-05 21:19:52, thread: 0x16f8f3000] Debounce::trigger(): LOCK 2
[2024-09-05 21:19:52, thread: 0x16f8f3000] Debounce::trigger(): LOCK 3
[2024-09-05 21:19:52, thread: 0x16f8f3000] Watcher::notify(): LOCK 3
[2024-09-05 21:19:52, thread: 0x16f867000] Debounce::notify(): LOCK 1
[2024-09-05 21:19:52, thread: 0x16f867000] Debounce::notify(): LOCK 2
[2024-09-05 21:19:52, thread: 0x16f867000] Watcher::triggerCallbacks() - LOCK 1
[2024-09-05 21:19:52, thread: 0x16f867000] Watcher::triggerCallbacks() - LOCK 2
[2024-09-05 21:19:52, thread: 0x16f8f3000] Watcher::notify(): LOCK 1
[2024-09-05 21:19:52, thread: 0x16f867000] Watcher::triggerCallbacks - LOCK 3
[2024-09-05 21:19:52, thread: 0x16f867000] Watcher::triggerCallbacks() - LOCK 1
[2024-09-05 21:19:52, thread: 0x16f867000] Watcher::triggerCallbacks() - LOCK 2
[2024-09-05 21:19:52, thread: 0x16f867000] Watcher::triggerCallbacks - LOCK 3
[2024-09-05 21:19:52, thread: 0x16f867000] Debounce::notify(): LOCK 3
[2024-09-05 21:19:52, thread: 0x16f8f3000] Watcher::notify(): LOCK 2
[2024-09-05 21:19:52, thread: 0x16f8f3000] Watcher::notify(): LOCK 3
[2024-09-05 21:19:52, thread: 0x16f8f3000] Watcher::notify(): LOCK 1
[2024-09-05 21:19:52, thread: 0x16f8f3000] Watcher::notify(): LOCK 2
[2024-09-05 21:19:52, thread: 0x16f8f3000] Debounce::trigger(): LOCK 1
[2024-09-05 21:19:52, thread: 0x16f8f3000] Debounce::trigger(): LOCK 2
[2024-09-05 21:19:52, thread: 0x16f8f3000] Debounce::trigger(): LOCK 3
[2024-09-05 21:19:52, thread: 0x16f8f3000] Watcher::notify(): LOCK 3
[2024-09-05 21:19:52, thread: 0x16f8f3000] Watcher::notify(): LOCK 1
[2024-09-05 21:19:52, thread: 0x16f8f3000] Watcher::notify(): LOCK 2
[2024-09-05 21:19:52, thread: 0x16f8f3000] Debounce::trigger(): LOCK 1
[2024-09-05 21:19:52, thread: 0x16f8f3000] Debounce::trigger(): LOCK 2
[2024-09-05 21:19:52, thread: 0x16f8f3000] Debounce::trigger(): LOCK 3
[2024-09-05 21:19:52, thread: 0x16f8f3000] Watcher::notify(): LOCK 3
[2024-09-05 21:19:52, thread: 0x16f867000] Debounce::notify(): LOCK 1
[2024-09-05 21:19:52, thread: 0x16f867000] Debounce::notify(): LOCK 2
[2024-09-05 21:19:52, thread: 0x16f867000] Watcher::triggerCallbacks() - LOCK 1
[2024-09-05 21:19:52, thread: 0x16f867000] Watcher::triggerCallbacks() - LOCK 2
[2024-09-05 21:19:52, thread: 0x16f867000] Watcher::triggerCallbacks - LOCK 3
[2024-09-05 21:19:52, thread: 0x16f867000] Watcher::triggerCallbacks() - LOCK 1
[2024-09-05 21:19:52, thread: 0x16f867000] Watcher::triggerCallbacks() - LOCK 2
[2024-09-05 21:19:52, thread: 0x16f867000] Watcher::triggerCallbacks - LOCK 3
[2024-09-05 21:19:52, thread: 0x16f867000] Debounce::notify(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 3
[2024-09-05 21:19:53, thread: 0x16f867000] Debounce::notify(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f867000] Debounce::notify(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f867000] Watcher::triggerCallbacks() - LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 1


You can see that 2 threads are involved here: the debounce thread and the watcher thread. Both have the chance of locking the mutex in Watcher.cc as well as Debounce.cc. If the one thread was successful in locking the one mutex while the other was successful in locking the other and then they both need the others lock to proceed, the watcher deadlocks.

Lets focus on the last log messages, separating threads with newlines:

...

[2024-09-05 21:19:53, thread: 0x16f867000] Debounce::notify(): LOCK 1

[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 1
[2024-09-05 21:19:53, thread: 0x16f8f3000] Watcher::notify(): LOCK 2

[2024-09-05 21:19:53, thread: 0x16f867000] Debounce::notify(): LOCK 2
[2024-09-05 21:19:53, thread: 0x16f867000] Watcher::triggerCallbacks() - LOCK 1

[2024-09-05 21:19:53, thread: 0x16f8f3000] Debounce::trigger(): LOCK 1

<NO MORE OUTPUT DUE TO DEADLOCK>

Threads:

  • 0x16f867000: debouncer
  • 0x16f8f3000: watcher

The deadlock is:

  • the debouncer wakes up to process the events that have accumulated, it locks the Debouncer.cc lock successfully
  • meanwhile an event is coming from the watcher and it locks the Watcher.cc lock successfully
  • the debouncer now triggers callbacks which is going into Watcher.cc that requires the lock in Watcher.cc that the other thread holds
  • the watcher wants to trigger the debouncer but needs the lock of Debouncer.cc that the other thread holds

Note: I have seen other variations of this deadlock

@devongovett I would need your advice here and would appreciate if you could have a look. I have a fix in mind that seems to help but I am not sure if its addressing all the issues. Specifically before this line:

https://github.com/parcel-bundler/watcher/blob/99cf43b739b25bbd3538506dbc4e87a202f7af84/src/Watcher.cc#L69

call lk.unlock(); so that the Watcher.cc lock is free when the debouncer thread needs it.

bpasero avatar Sep 05 '24 19:09 bpasero

Seeing how it was working in 2.1.0, I noticed a second mutex that was only used for triggerCallbacks and restoring that seems to fix the deadlock in 2.4.x 🤞

PR: https://github.com/parcel-bundler/watcher/pull/189

bpasero avatar Sep 07 '24 05:09 bpasero

I updated https://github.com/parcel-bundler/watcher/pull/189 and now have more understanding when exactly this issue happens. The fix is relatively simple.

bpasero avatar Sep 11 '24 07:09 bpasero